Abstract

We discuss basic statistical properties of systems with multifractal structure. This is made possible by extending the usual Gibbs-Shannon entropy to a more general framework: Renyi's information entropy. We address the renormalization issue for Renyi's entropy on (multi)fractal sets and consequently show how Renyi's parameter is connected with the multifractal singularity spectrum. The maximal entropy approach then provides a passage between Renyi's information entropy and the thermodynamics of multifractals. Important issues such as Renyi's entropy versus Tsallis-Havrda-Charvat entropy and the PDF reconstruction theorem are also studied. Finally, some further speculations on a possible relevance of our approach to cosmology are discussed.

I. INTRODUCTION

The past two decades have witnessed an explosion of activity and progress in both equilibrium and non-equilibrium statistical physics. The catalyst has been the massive infusion of ideas from information theory, the theory of chaotic dynamical systems, the theory of critical phenomena, and quantum field theory. These ideas include generalized information measures, quasi-periodic and strange attractors, fully developed turbulence, percolation, renormalization of large-scale dynamics, and attractive, albeit speculative, ideas about quark-gluon plasma formation and dynamics. It is the purpose of this paper to proceed along this line of development. The issue at stake is what modifications in statistical physics one should expect when dealing with systems with varied fractal dimension, i.e., multifractals. The view which we present here hinges on two mutually interrelated concepts, namely Renyi's information entropy [3,4] and (multi)fractal geometry. In this connection we would like to stress that, in order to exhibit the link between Renyi information entropies and (multi)fractal systems as generally as possible, we do not put much emphasis on the important yet rather narrow class of (multi)fractal systems: chaotic dynamical systems.

One of the fundamental observations of information theory is that the most general functional form for the mean transmitted information (i.e., information entropy) is that of Renyi. In Section II we briefly outline Renyi's proof and discuss some fundamentals of information theory which will prove indispensable in the following sections. We show that, with due mathematical care, Shannon's entropy can be viewed as a special case of Renyi's entropy when Renyi's parameter $\alpha \to 1$. We also address the question of the status of Tsallis-Havrda-Charvat (THC) entropy [1,2] in the framework of information theory.

Although Renyi's information measure offers a very natural, and maybe conceptually the cleanest, setting for the entropy, it has not so far found as much applicability as Shannon's (or Gibbs's) entropy. The explanation, no doubt, lies in two facts: the ambiguous renormalization of Renyi's entropy for non-discrete distributions and the little insight into the meaning of Renyi's parameter $\alpha$. Surprisingly little work has been done towards understanding either of these points. In Section III we aim to address the first one. We choose, in a sense, a minimal renormalization prescription conforming to the condition of additivity of independent information. Renyi's entropy thus obtained is then directly related to the information content ("negentropy").

To clarify the position of Renyi's entropy in physics, or in other words, to find the physical interpretation of the parameter $\alpha$, we resort in Section IV to systems with a multifractal structure. Such systems are very important and highly diverse, including the turbulent flow of fluids [5,6], percolation [7], diffusion-limited aggregation (DLA) systems [8], DNA sequences [9], finance [10], and string theory [11]. Using the reconstruction theorem we argue that in order to obtain "full" information about a (multi)fractal system we need to know Renyi's entropies to all orders. Still, for discrete spaces and simple metric spaces (like $\mathbb{R}^d$) we find that the contribution from Shannon's entropy dominates over all other Renyi entropies. We further show that from the maximal entropy (MaxEnt) point of view, extremizing the Shannon entropy on a multifractal is equivalent to extremizing Renyi's entropy directly without invoking the multifractal structure explicitly. An application of this result to a cosmic string network will be presented elsewhere [12].

We close with Section V where we present some speculations on the relevance of the outlined approach to string cosmology and quantum mechanics. For the reader's convenience we supplement the paper with eight appendices which clarify some finer mathematical manipulations.

II. RENYI'S ENTROPY OF DISCRETE 
PROBABILITY DISTRIBUTIONS 

A. Renyi's entropy and information theory 

We begin this section by summarizing the information theory procedure leading to Renyi's entropy [3,4]. This is of course well known but it may be useful to repeat it here in order to make our discussion self-contained. We will also need to generalize it when considering THC entropy in Section II D and the axiomatization of Renyi's entropy in Appendix B.

Let us start with a discrete probability distribution $\mathcal{P} = \{p_1, p_2, \ldots, p_n\}$ fulfilling the usual conditions

$$p_k \geq 0, \qquad \sum_k p_k = 1. \tag{2.1}$$

We then assume three things about information. Firstly, information should be additive for two independent events. Secondly, information should depend purely on $\mathcal{P}$. These two conditions can also be formulated in the following way: if we observe the outcome of two independent events with respective probabilities $p$ and $q$, then the total received information is the sum of the two partial ones. Therefore the following functional equality holds:

$$\mathcal{I}(pq) = \mathcal{I}(p) + \mathcal{I}(q). \tag{2.2}$$

The latter is the well known modified Cauchy functional equation [13] which has (under fairly broad assumptions [4,14]) a unique class of solutions, $K \log_2(\ldots)$. The constant $K$ is then fixed via an appropriate "boundary" condition. Setting $\mathcal{I}(1/2) = 1$ we obtain the so-called Hartley measure of information [15]. So the amount of information received by learning that an event of probability $p$ took place equals

$$\mathcal{I}(p) = -\log_2(p). \tag{2.3}$$
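
As a small numerical illustration of (2.2) and (2.3), the following sketch evaluates the Hartley information and checks additivity for two independent events. The function name `hartley_information` and the sample probabilities are illustrative choices, not taken from the text.

```python
import math


def hartley_information(p: float) -> float:
    """Information in bits gained on learning that an event of probability p occurred, Eq. (2.3)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)


# Illustrative probabilities of two independent events (assumed values).
p, q = 0.25, 0.125

joint = hartley_information(p * q)                         # I(pq)
parts = hartley_information(p) + hartley_information(q)    # I(p) + I(q)
print(joint, parts)
```

The normalization $\mathcal{I}(1/2) = 1$ corresponds to measuring information in bits; a different base of the logarithm would only rescale the constant $K$.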



The third assumption is that if different amounts of information occur with different probabilities, the total amount of information is the average of the individual information weighted by the probabilities of their occurrences. In general, if the possible outcomes of an experiment are $A_1, A_2, \ldots, A_n$ with corresponding probabilities $p_1, p_2, \ldots, p_n$, and $A_k$ conveys $\mathcal{I}_k$ bits of information, then the total amount of information conveyed would be

$$\mathcal{I}(\mathcal{P}, \mathfrak{I}) = \sum_{k=1}^{n} p_k \mathcal{I}_k, \tag{2.4}$$

where $\mathfrak{I} = \{\mathcal{I}_1, \ldots, \mathcal{I}_n\}$. However, the linear averaging implemented in (2.4) is only a specific case of a more general mean. If $f$ is a real function having an inverse $f^{-1}$, then the number

$$f^{-1}\left(\sum_{k=1}^{n} p_k f(x_k)\right) \tag{2.5}$$

is called the mean value of $x_1, x_2, \ldots, x_n$ associated with $f$. As shown in Refs. [16-18], (2.5) prescribes the most general mean compatible with the postulates of probability theory (see, e.g., [3]). The function $f$ is often referred to as the Kolmogorov-Nagumo function.
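
The generalized mean (2.5) is easy to experiment with. The sketch below is a minimal illustration, with the helper name `kn_mean` and the sample data chosen here for convenience; it shows that a linear $f$ reproduces the ordinary weighted average while a logarithmic $f$ gives the weighted geometric mean.

```python
import math
from typing import Callable, Sequence


def kn_mean(xs: Sequence[float], ps: Sequence[float],
            f: Callable[[float], float],
            f_inv: Callable[[float], float]) -> float:
    """Mean of xs with weights ps associated with the invertible function f, Eq. (2.5)."""
    return f_inv(sum(p * f(x) for p, x in zip(ps, xs)))


xs = [1.0, 2.0, 4.0]     # illustrative values
ps = [0.5, 0.25, 0.25]   # illustrative probabilities (sum to 1)

# Linear f: the usual weighted arithmetic mean.
arithmetic = kn_mean(xs, ps, lambda x: 3.0 * x, lambda y: y / 3.0)

# Logarithmic f: the weighted geometric mean.
geometric = kn_mean(xs, ps, math.log, math.exp)

print(arithmetic, geometric)
```

Note that rescaling a linear $f$ (the factor 3 above) leaves the mean unchanged, which is the simplest instance of two functions generating the same mean.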

The foregoing analysis suggests that in the most general case the measure of the amount of transmitted information should admit the form

$$\mathcal{I}(\mathcal{P}, \mathfrak{I}) = f^{-1}\left(\sum_{k=1}^{n} p_k f(-\log_2(p_k))\right). \tag{2.6}$$

The natural question arises: what is the possible mathematical form of $f$, or in other words, what is the most general class of functions $f$ which will still provide a measure of information compatible with the additivity postulate? Obviously for a given set of outcomes, many possible means can be defined, depending on which features of the outcomes are of interest. It therefore comes as a pleasant surprise to find that the additivity postulate allows only for two classes of $f$'s: linear and exponential functions. The proof of this statement is simple and clarifies a good deal about $f$, so for future reference we sketch its main points. An alternative proof based on a scaling argument is presented in Appendix A.

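Let an experiment $\mathcal{K}$ be a union of two independent experiments $\mathcal{K}_1$ and $\mathcal{K}_2$. Let us further assume that we receive $\mathcal{I}_k^{(1)}$ bits of information with probability $p_k$ connected with $\mathcal{K}_1$ and $\mathcal{I}_l^{(2)}$ bits of information with probability $q_l$ connected with $\mathcal{K}_2$. As a result we receive $\mathcal{I}_k^{(1)} + \mathcal{I}_l^{(2)}$ bits of information with probability $p_k q_l$. We assume further that there are $m$ possible outcomes in the $\mathcal{K}_1$ experiment (i.e., $k = 1, 2, \ldots, m$) and $n$ possible outcomes in the $\mathcal{K}_2$ experiment (i.e., $l = 1, 2, \ldots, n$). Invoking the postulate of additivity we may write

$$f^{-1}\left(\sum_{k=1}^{m}\sum_{l=1}^{n} p_k q_l\, f\!\left(\mathcal{I}_k^{(1)} + \mathcal{I}_l^{(2)}\right)\right) = f^{-1}\left(\sum_{k=1}^{m} p_k f\!\left(\mathcal{I}_k^{(1)}\right)\right) + f^{-1}\left(\sum_{l=1}^{n} q_l f\!\left(\mathcal{I}_l^{(2)}\right)\right). \tag{2.7}$$

The former must hold completely generally, irrespective of our choice of $\mathcal{P} = \{p_1, \ldots, p_m\}$, $\mathcal{Q} = \{q_1, \ldots, q_n\}$ and irrespective of the actual choice of independent experiments $\mathcal{K}_1$, $\mathcal{K}_2$. So if we choose $\mathcal{I}_l^{(2)} = \mathcal{I}$ independently of $l$ we obtain from (2.7)

$$f^{-1}\left(\sum_{k=1}^{m} p_k f\!\left(\mathcal{I}_k^{(1)} + \mathcal{I}\right)\right) = f^{-1}\left(\sum_{k=1}^{m} p_k f\!\left(\mathcal{I}_k^{(1)}\right)\right) + \mathcal{I}. \tag{2.8}$$

Let us denote $f_y(x) = f(x + y)$ (so namely $f^{-1}(x) - y = f_y^{-1}(x)$). Thus Eq.(2.8) may be recast into the form

$$f_{\mathcal{I}}^{-1}\left(\sum_{k=1}^{m} p_k f_{\mathcal{I}}\!\left(\mathcal{I}_k^{(1)}\right)\right) = f^{-1}\left(\sum_{k=1}^{m} p_k f\!\left(\mathcal{I}_k^{(1)}\right)\right). \tag{2.9}$$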

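So the functions $f_{\mathcal{I}}$ and $f$ generate the same mean. It is well known in the theory of means (see e.g., [19]) that Eq.(2.9) holds only if $f_{\mathcal{I}}$ is a linear function of $f$. So we have

$$f_{\mathcal{I}}(z) = f(z + \mathcal{I}) = a(\mathcal{I}) f(z) + b(\mathcal{I}). \tag{2.10}$$

Here $a(\ldots)$ and $b(\ldots)$ are independent of $z$. Without loss of generality we shall assume that $f(0) = 0$ (or otherwise we adjust $b$). As a result $b(\mathcal{I}) = f(\mathcal{I})$. Therefore

$$\begin{aligned}
f(z + \mathcal{I}) &= a(\mathcal{I}) f(z) + f(\mathcal{I}), \\
f(z + \mathcal{I}) &= a(z) f(\mathcal{I}) + f(z),
\end{aligned} \tag{2.11}$$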



where the second line was obtained by a simple interchange $z \leftrightarrow \mathcal{I}$. Subtraction of both lines in (2.11) leads to the following separation of variables ($z \neq 0$, $\mathcal{I} \neq 0$):

$$\frac{a(z) - 1}{f(z)} = \frac{a(\mathcal{I}) - 1}{f(\mathcal{I})} = \gamma. \tag{2.12}$$

($\gamma$ is a constant independent of both $z$ and $\mathcal{I}$.) The solution of (2.12) has a simple form

$$a(x) - 1 = \gamma f(x). \tag{2.13}$$
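
The separation of variables leading to (2.12) and (2.13) can be spelled out in a few lines. The following is only a sketch of the intermediate algebra, under the assumption that $f(z) \neq 0$ and $f(\mathcal{I}) \neq 0$ for nonzero arguments so that the division is allowed.

```latex
\begin{align*}
a(\mathcal{I})\, f(z) + f(\mathcal{I}) &= a(z)\, f(\mathcal{I}) + f(z)
  && \text{(both lines of (2.11) describe } f(z + \mathcal{I})\text{)} \\
\bigl[a(\mathcal{I}) - 1\bigr]\, f(z) &= \bigl[a(z) - 1\bigr]\, f(\mathcal{I})
  && \text{(collect terms)} \\
\frac{a(z) - 1}{f(z)} &= \frac{a(\mathcal{I}) - 1}{f(\mathcal{I})}
  && \text{(divide by } f(z)\, f(\mathcal{I})\text{)}
\end{align*}
```

Since the left-hand side depends only on $z$ and the right-hand side only on $\mathcal{I}$, both must equal a common constant, which is the $\gamma$ of (2.12).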



Note that (2.13) holds true also for $x = 0$. In connection with (2.13) it is useful to distinguish two cases: $\gamma = 0$ and $\gamma \neq 0$. In the first case, when $\gamma = 0$, $a(x) = 1$ and we get Cauchy's functional equation [13]

$$f(z + \mathcal{I}) = f(z) + f(\mathcal{I}), \tag{2.14}$$

which for $z, \mathcal{I} \in \mathbb{R}$ has the well known solution $f(x) = cx$ with a nonzero constant $c$. This is, in a sense, the most elementary Kolmogorov-Nagumo function. Plugging the latter into Eq.(2.6), the measure of transmitted information boils down to Shannon's measure

$$\mathcal{I}(\mathcal{P}, \mathfrak{I}) = -\sum_{k=1}^{n} p_k \log_2(p_k) = \mathcal{H}(\mathcal{P}). \tag{2.15}$$



In the second case, when $\gamma \neq 0$, $a(x)$ fulfills the modified Cauchy functional equation [13]

$$a(z + \mathcal{I}) = a(z)\, a(\mathcal{I}), \tag{2.16}$$

which for continuous $a(\ldots)$ and $z, \mathcal{I} \in \mathbb{R}$ has only exponential solutions. Thus we may generally write $a(x) = 2^{(1-\alpha)x}$ with $\alpha \neq 1$ being some constant. As a result we get $f(x) = [2^{(1-\alpha)x} - 1]/\gamma$. Plugging this into Eq.(2.6), the measure of transmitted information will be

$$\mathcal{I}(\mathcal{P}, \mathfrak{I}) = \frac{1}{(1-\alpha)} \log_2\left(\sum_{k=1}^{n} p_k^{\alpha}\right). \tag{2.17}$$

The information measure (2.17) is usually called the generalized information measure, or information measure of order $\alpha$, or simply Renyi's entropy. We will denote the explicit order of Renyi's entropy as a subscript in $\mathcal{I}(\ldots)$.
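
To make the passage from the exponential Kolmogorov-Nagumo function to (2.17) concrete, the sketch below evaluates (2.6) with $f(x) = [2^{(1-\alpha)x} - 1]/\gamma$ and compares it with the closed form. The function names and the sample distribution are illustrative assumptions; the check also shows that the value of $\gamma$ drops out.

```python
import math
from typing import Sequence


def renyi_entropy(ps: Sequence[float], alpha: float) -> float:
    """Closed form (2.17) in bits; requires alpha != 1."""
    return math.log2(sum(p ** alpha for p in ps if p > 0)) / (1.0 - alpha)


def renyi_via_kn_mean(ps: Sequence[float], alpha: float, gamma: float = 1.0) -> float:
    """Eq. (2.6) evaluated with the exponential Kolmogorov-Nagumo function."""
    def f(x: float) -> float:
        return (2.0 ** ((1.0 - alpha) * x) - 1.0) / gamma

    def f_inv(y: float) -> float:
        return math.log2(gamma * y + 1.0) / (1.0 - alpha)

    # Average the Hartley informations -log2(p_k) with the generalized mean.
    return f_inv(sum(p * f(-math.log2(p)) for p in ps if p > 0))


ps = [0.5, 0.25, 0.25]   # illustrative distribution
alpha = 2.0              # illustrative order

closed_form = renyi_entropy(ps, alpha)
via_mean = renyi_via_kn_mean(ps, alpha, gamma=1.0)
via_mean_other_gamma = renyi_via_kn_mean(ps, alpha, gamma=-3.7)
print(closed_form, via_mean, via_mean_other_gamma)
```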

Although the foregoing operational (pragmatic) way of arguing is quite robust, some readers may find it more justifiable to see Renyi's entropy properly axiomatized. Actually, the Shannon entropy was first axiomatized by Shannon [20] and then later some axioms were weakened (or substituted) by Faddeev [21], Khinchin [22] and several other authors [23]. The Renyi entropy was axiomatized by Renyi himself [3,4] and afterwards sharpened by Daroczy [24] and others [25]. In further considerations we will, however, find it useful to use a slightly different set of axioms than those utilized in [3,4,24,25]. In fact, in Appendix B we show that the information measures (2.15) and (2.17) can be characterized by the following axioms:

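1. For a given integer $n$ and given $\mathcal{P} = \{p_1, p_2, \ldots, p_n\}$ ($p_k \geq 0$, $\sum_k p_k = 1$), $\mathcal{I}(\mathcal{P})$ is a continuous function with respect to all its arguments.

2. For a given integer $n$, $\mathcal{I}(p_1, p_2, \ldots, p_n)$ takes its largest value for $p_k = 1/n$ ($k = 1, 2, \ldots, n$) with the normalization $\mathcal{I}(\frac{1}{2}, \frac{1}{2}) = 1$.

3. For a given $\alpha \in \mathbb{R}$; $\mathcal{I}(A \cap B) = \mathcal{I}(A) + \mathcal{I}(B|A)$ with

$$\mathcal{I}(B|A) = f^{-1}\left(\sum_k \varrho_k(\alpha) f(\mathcal{I}(B|A = A_k))\right),$$

and $\varrho_k(\alpha) = (p_k)^{\alpha} / \sum_k (p_k)^{\alpha}$ (distribution $\mathcal{P}$ corresponds to the experiment $A$).

4. $f$ is invertible and positive in $[0, \infty)$.

5. $\mathcal{I}(p_1, p_2, \ldots, p_n, 0) = \mathcal{I}(p_1, p_2, \ldots, p_n)$, i.e., by adding an event of probability zero (impossible event) we do not gain any new information.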
B. Some observations about Renyi's entropy

Before going further let us observe some key characteristics of Renyi's entropy which will prove essential in the following sections.

(a) $\mathcal{I}_\alpha(B|A)$ appearing in axiom 3 can be viewed as conditional information. In fact, in Appendix C we show that $\mathcal{I}_\alpha(B|A) = 0$ iff outcome $A$ uniquely determines outcome $B$. We also show that when $A$ and $B$ are independent then $\mathcal{I}_\alpha(B|A) = \mathcal{I}_\alpha(B)$ and hence $\mathcal{I}_\alpha(A \cap B) = \mathcal{I}_\alpha(A) + \mathcal{I}_\alpha(B)$, as expected. Alas, the reverse implication (i.e., $\mathcal{I}_\alpha(B|A) = \mathcal{I}_\alpha(B) \Rightarrow$ $A$ and $B$ are independent) generally holds only when $B$ has a uniform distribution.

(b) It is interesting to note that we can write (with a bit of hindsight) in axiom 3

$$\mathcal{I}_\alpha(B|A) = f^{-1}\left(\sum_k \varrho_k(\alpha) f(\mathcal{I}_\alpha(B|A = A_k))\right).$$

Similarly, Eq.(2.6) can be rewritten in terms of the weights $\varrho_k(\alpha)$. This indicates that when the constituent informations of order $\alpha$ enter a mean value calculation they must be weighted by the $\varrho_k(\alpha)$'s and not the $p_k$'s, and this should hold true whatever the Kolmogorov-Nagumo function is. The former result may be generalized in the following way: whenever outcomes of a measurement carry information of order $\alpha$ they must be weighted with $\varrho_k(\alpha)$. When outcomes actually carry information of order $\alpha$ will be discussed in Section IV B.

(c) Another important property of Renyi's entropy is its concavity for $\alpha < 1$ (for $\alpha > 1$ Renyi's entropy is neither purely convex nor purely concave). This is a simple consequence of the fact that both $\log_2(x)$ and $x^{\alpha}$ ($\alpha < 1$) are concave functions (while $x^{\alpha}$ is convex for $\alpha > 1$).

(d) A notable point which we will use in Section IV is that $\mathcal{I}_\alpha$ is a monotonically decreasing function of $\alpha$. This might be seen from the inequality

$$\frac{d\mathcal{I}_\alpha}{d\alpha} = \frac{1}{(1-\alpha)^2}\left\{-\log_2\langle \mathcal{P}^{1-\alpha}\rangle_\alpha + \langle \log_2 \mathcal{P}^{1-\alpha}\rangle_\alpha\right\} \leq 0. \tag{2.18}$$

Here the expectation value $\langle \ldots \rangle_\alpha$ is defined with respect to the distribution $\varrho_k(\alpha)$. The inequality in (2.18) is due to Jensen's inequality and the concavity of $\log_2(x)$. Note that $d\mathcal{I}_\alpha/d\alpha = 0$ only when the Jensen inequality used in the derivation of (2.18) is an equality. This happens iff $\mathcal{P}$ = const. (see e.g., [19]), or in other words when $\mathcal{P}$ is uniform. Consequently either $\mathcal{I}_\alpha$ is a strictly monotonically decreasing function of $\alpha$ or all $\mathcal{I}_\alpha$ are identical. One never finds, for example, $\mathcal{I}_{\alpha_1} < \mathcal{I}_{\alpha_2} = \mathcal{I}_{\alpha_3}$ for $\alpha_1 > \alpha_2 > \alpha_3$.

C. Renyi's entropy and Shannon's entropy

Now we turn to the investigation of the information measure of order 1. An important element in this discussion is the fact that $\mathcal{I}_\alpha$ is analytic at $\alpha = 1$. This can be seen by continuing the index $\alpha$ into the complex plane and inspecting the behavior of $\log_2\left(\sum_{k=1}^{n} p_k^{z}\right)$ for $z \in \mathbb{C}$. The former is analytic provided that $\sum_{k=1}^{n} p_k^{z}$ does not lie on the negative real axis. Let us now consider the situation where $z = 1 + r e^{i\varphi}$ (i.e., we draw a circle with radius $r$ centered at $z = 1$). Thus $\log_2\left(\sum_{k=1}^{n} p_k^{z}\right)$ is analytic throughout the entire complex plane except the regions where the following two conditions hold:

$$\begin{aligned}
\sum_{k=1}^{n} p_k^{1 + r\cos\varphi} \sin(r \sin\varphi\, \ln(p_k)) &= 0, \\
\sum_{k=1}^{n} p_k^{1 + r\cos\varphi} \cos(r \sin\varphi\, \ln(p_k)) &\leq 0.
\end{aligned} \tag{2.19}$$



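Let us put $r < |\pi / (2 \ln(p_k)_{\min})|$. Then evidently for such $r$'s the conditions (2.19) cannot be fulfilled together and we are safely in the analyticity region. Consider the contour integral

$$\oint dz\, \frac{1}{(1 - z)} \log_2\left(\sum_{k=1}^{n} p_k^{z}\right) = \oint dz\, \mathcal{I}_z(\mathcal{P}), \tag{2.20}$$

around a contour $z = 1 + r e^{i\varphi}$, $\varphi \in [0, 2\pi)$. The residue theorem then assures that (2.20) vanishes and as a result Renyi's entropy is analytic everywhere inside the contour (so also at $z = 1$). This shows that the singularity of $\mathcal{I}_\alpha(\mathcal{P})$ at $\alpha = 1$ is only spurious and, in fact, Renyi's entropy is differentiable at $\alpha = 1$ to all orders. Using the Cauchy formula we can directly write

$$\begin{aligned}
\mathcal{I}_1(\mathcal{P}) &= \frac{1}{2\pi i}\oint dz\, \frac{\log_2\left(\sum_{k=1}^{n} p_k^{z}\right)}{(1 - z)(z - 1)} \\
&= \frac{1}{2\pi i}\oint dz \left(\frac{d}{dz}\frac{1}{(z - 1)}\right) \log_2\left(\sum_{k=1}^{n} p_k^{z}\right) \\
&= -\frac{1}{2\pi i}\oint dz\, \frac{\sum_{k=1}^{n} p_k^{z} \log_2(p_k)}{(z - 1)\sum_{l=1}^{n} p_l^{z}} \\
&= -\sum_{k=1}^{n} p_k \log_2(p_k) = \mathcal{H}(\mathcal{P}),
\end{aligned} \tag{2.21}$$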

where the contour of integration is the same as in the case of (2.20). It is usually argued that it is a matter of modifying one of Shannon's axioms to get Renyi's entropy. We, however, do not intend to follow this path simply because the Shannon entropy, as we have just seen, can be uniquely determined from the behavior of (analytically continued) Renyi's entropy in the vicinity of $z = 1$. In fact, we do not even need to be in the vicinity because the circle used in the contour integral (2.21) can be analytically continued to any curve which lies in the 1st and 4th quadrant and which encircles the point $z = 1$. The view which we intend to advocate here is that the Shannon entropy is not a special information measure deserving separate axiomatization but a member of a wide class of entropies embraced by a single unifying axiomatics.
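
The contour argument can be checked numerically. On the circle $z = 1 + r e^{i\varphi}$ the Cauchy formula for $\mathcal{I}_1$ reduces to the average of $\mathcal{I}_z(\mathcal{P})$ over $\varphi$, which the sketch below approximates by a uniform sum. The function name, the sample distribution, the number of sample points and the choice of radius (half of the bound $|\pi/(2\ln(p_k)_{\min})|$) are all illustrative assumptions.

```python
import numpy as np


def renyi_on_circle(ps: np.ndarray, radius: float, num_points: int = 256) -> np.ndarray:
    """Analytically continued Renyi entropy I_z(P) sampled on z = 1 + r exp(i phi)."""
    phi = 2.0 * np.pi * np.arange(num_points) / num_points
    z = 1.0 + radius * np.exp(1j * phi)
    # sum_k p_k^z evaluated as sum_k exp(z ln p_k); shape (num_points,)
    s = np.exp(np.outer(np.log(ps), z)).sum(axis=0)
    # Principal branch of the logarithm; valid because Re(s) > 0 for this radius.
    return np.log(s) / np.log(2.0) / (1.0 - z)


probs = np.array([0.5, 0.3, 0.2])                      # illustrative distribution
r_max = abs(np.pi / (2.0 * np.log(probs.min())))       # bound quoted in the text
radius = 0.5 * r_max

# (1 / 2 pi i) * contour integral of I_z / (z - 1) equals the mean of I_z over phi.
estimate = renyi_on_circle(probs, radius).mean()
shannon = -np.sum(probs * np.log2(probs))
print(estimate.real, shannon)
```

The imaginary part of the estimate vanishes up to rounding, as it must for a quantity that equals the real number $\mathcal{H}(\mathcal{P})$.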

An important consequence of the fact that $\mathcal{I}_\alpha$ is a monotonically decreasing function of $\alpha$ is embodied in the following two inequalities:

$$\mathcal{H} \leq \mathcal{I}_\alpha \leq \log_2 n, \qquad 0 < \alpha \leq 1, \tag{2.22}$$

$$\mathcal{I}_\alpha \leq \mathcal{H} \leq \log_2 n, \qquad \alpha \geq 1. \tag{2.23}$$

Inequality (2.23) shows that $\mathcal{H}$ represents an upper bound for all Renyi entropies with $\alpha > 1$. This finding will play an important role in the reconstruction theorem in Section IV B.
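
The monotonic decrease of $\mathcal{I}_\alpha$ and the bounds (2.22) and (2.23) can be verified directly for any concrete distribution. The following sketch uses an arbitrary non-uniform distribution and a handful of orders; both are illustrative choices.

```python
import math
from typing import Sequence


def renyi_entropy(ps: Sequence[float], alpha: float) -> float:
    """Renyi entropy in bits; the order alpha = 1 is treated as the Shannon limit."""
    if abs(alpha - 1.0) < 1e-12:
        return -sum(p * math.log2(p) for p in ps if p > 0)
    return math.log2(sum(p ** alpha for p in ps if p > 0)) / (1.0 - alpha)


ps = [0.5, 0.3, 0.15, 0.05]                       # illustrative non-uniform distribution
alphas = [0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 4.0]    # illustrative orders, increasing
values = [renyi_entropy(ps, a) for a in alphas]

shannon = renyi_entropy(ps, 1.0)
upper = math.log2(len(ps))

for a, v in zip(alphas, values):
    print(f"alpha = {a:4.2f}   I_alpha = {v:.6f}")
print("Shannon:", shannon, "  log2 n:", upper)
```

For a uniform distribution all printed values would coincide with $\log_2 n$, in line with the equality case of Jensen's inequality.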



D. Renyi's entropy and THC entropy 

Due to increasing interest in long-range correlated systems and non-equilibrium phenomena, the so-called Tsallis (or non-extensive) entropy has recently been much discussed. Although first introduced by Havrda and Charvat in the context of cybernetics theory [1], it was Tsallis [2] who exploited its non-extensive features and placed it in a physical setting. THC entropy reads

$$\mathcal{S}_\alpha = \frac{1}{(1-\alpha)}\left[\sum_{k=1}^{n} (p_k)^{\alpha} - 1\right], \qquad \alpha > 0. \tag{2.24}$$



The most important properties of THC entropy can be easily read out of (2.24). For instance, employing Jensen's inequality we have for $\alpha > 1$ that $\sum_k p_k^{\alpha} \leq 1$ (while for $0 < \alpha < 1$ the reverse inequality holds) and hence $\mathcal{S}_\alpha$ is non-negative. Similarly, choosing any pair of distributions $\mathcal{P}$ and $\mathcal{Q}$, and a real number $0 < \lambda < 1$, we have

$$\mathcal{S}_\alpha(\lambda\mathcal{P} + (1-\lambda)\mathcal{Q}) \geq \lambda\,\mathcal{S}_\alpha(\mathcal{P}) + (1-\lambda)\,\mathcal{S}_\alpha(\mathcal{Q}), \tag{2.25}$$

and so THC entropy is a concave function of its probability distribution. Eq.(2.25) results from Jensen's inequality and the concavity of $x^{\alpha}/(1-\alpha)$. In addition, by l'Hospital's rule we get that

$$\lim_{\alpha \to 1} \mathcal{S}_\alpha = \lim_{\alpha \to 1} \mathcal{I}_\alpha = \mathcal{H}. \tag{2.26}$$

Thus in the $\alpha \to 1$ limit THC entropy reduces to Shannon's entropy.

Perhaps the most distinguished feature of THC entropy is the so-called pseudo-additivity [2,27]

$$\mathcal{S}_\alpha(A \cap B) = \mathcal{S}_\alpha(A) + \mathcal{S}_\alpha(B|A) + (1-\alpha)\,\mathcal{S}_\alpha(A)\,\mathcal{S}_\alpha(B|A),$$

for two experiments $A$ and $B$; $\mathcal{S}_\alpha(B|A)$ represents here the conditional THC entropy. A remarkable, albeit not yet understood, aspect of pseudo-additivity is that in the case of independent experiments THC entropy is not additive. The interested reader may find further discussion of THC entropy, for instance, in Ref. [28].
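
Pseudo-additivity for independent experiments is easy to confirm numerically. The sketch below builds the joint distribution of two independent experiments and compares THC and Renyi entropies; for independent experiments the conditional entropy is taken to reduce to the entropy of $B$. Function names, distributions and the order are illustrative assumptions.

```python
import math
from typing import Sequence


def thc_entropy(ps: Sequence[float], alpha: float) -> float:
    """Tsallis-Havrda-Charvat entropy, Eq. (2.24); requires alpha != 1."""
    return (sum(p ** alpha for p in ps if p > 0) - 1.0) / (1.0 - alpha)


def renyi_entropy(ps: Sequence[float], alpha: float) -> float:
    """Renyi entropy in bits, Eq. (2.17); requires alpha != 1."""
    return math.log2(sum(p ** alpha for p in ps if p > 0)) / (1.0 - alpha)


dist_a = [0.7, 0.3]          # illustrative experiment A
dist_b = [0.2, 0.5, 0.3]     # illustrative experiment B, independent of A
alpha = 0.6                  # illustrative order

# Joint distribution of the two independent experiments.
joint = [p * q for p in dist_a for q in dist_b]

s_a, s_b = thc_entropy(dist_a, alpha), thc_entropy(dist_b, alpha)
thc_joint = thc_entropy(joint, alpha)
thc_pseudo = s_a + s_b + (1.0 - alpha) * s_a * s_b

renyi_joint = renyi_entropy(joint, alpha)
renyi_sum = renyi_entropy(dist_a, alpha) + renyi_entropy(dist_b, alpha)

print(thc_joint, thc_pseudo, s_a + s_b)
print(renyi_joint, renyi_sum)
```

The cross term $(1-\alpha)\mathcal{S}_\alpha(A)\mathcal{S}_\alpha(B)$ is exactly what separates the THC value of the joint distribution from the plain sum, while Renyi's entropy of the same joint distribution is additive.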

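Now we turn to the problem of finding the connection between Renyi's and THC entropy. To this end we utilize the identity

$$\mathcal{I}_\alpha = \frac{1}{(1-\alpha)} \log_2\left[(1-\alpha)\mathcal{S}_\alpha + 1\right] = \frac{1}{k}\int_0^{\mathcal{S}_\alpha} dx\, \frac{1}{1 + x(1-\alpha)}. \tag{2.27}$$

Here $k = \ln 2$ is the scale factor. For $|(1-\alpha)\mathcal{S}_\alpha| < 1$ we may expand the integrand in (2.27). In such a case the (geometric) series is absolutely convergent and we can integrate it term by term:

$$\mathcal{I}_\alpha = \frac{1}{k}\mathcal{S}_\alpha - \frac{1}{2k}(1-\alpha)\mathcal{S}_\alpha^2 + \mathcal{O}\left[(1-\alpha)^2 \mathcal{S}_\alpha^3\right]. \tag{2.28}$$

So apart from an unimportant factor $k$ (which just sets the scale for entropy units) we see that $\mathcal{I}_\alpha \approx \mathcal{S}_\alpha$, provided

$$|(1-\alpha)\mathcal{S}_\alpha| \ll 1. \tag{2.29}$$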
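
A quick numerical comparison shows how the expansion (2.28) behaves when condition (2.29) holds. The distribution and the order $\alpha = 0.95$ below are illustrative choices for which $(1-\alpha)\mathcal{S}_\alpha$ is small; the sketch also checks the first equality in (2.27).

```python
import math
from typing import Sequence

K = math.log(2.0)  # scale factor k = ln 2 of Eq. (2.27)


def thc_entropy(ps: Sequence[float], alpha: float) -> float:
    """Tsallis-Havrda-Charvat entropy, Eq. (2.24)."""
    return (sum(p ** alpha for p in ps) - 1.0) / (1.0 - alpha)


def renyi_entropy(ps: Sequence[float], alpha: float) -> float:
    """Renyi entropy in bits, Eq. (2.17)."""
    return math.log2(sum(p ** alpha for p in ps)) / (1.0 - alpha)


ps = [0.4, 0.3, 0.2, 0.1]   # illustrative distribution
alpha = 0.95                # illustrative order close to 1

s = thc_entropy(ps, alpha)
exact = renyi_entropy(ps, alpha)

from_identity = math.log2((1.0 - alpha) * s + 1.0) / (1.0 - alpha)   # Eq. (2.27)
first_order = s / K                                                   # leading term of (2.28)
second_order = s / K - (1.0 - alpha) * s ** 2 / (2.0 * K)             # two terms of (2.28)

print("small parameter:", (1.0 - alpha) * s)
print(exact, first_order, second_order)
```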



It should be understood that the expansion (2.28) is not necessarily an expansion in $(1-\alpha)$. In fact, condition (2.29) may be fulfilled in numerous ways. Obviously, for $\alpha \approx 1$ the inequality (2.29) is trivially satisfied. This should be expected because both $\mathcal{I}_\alpha$ and $\mathcal{S}_\alpha$ tend to the same limit value at $\alpha \to 1$. Thus the actual error estimate in this instance can be written as



k 



(2.30) 



and so the true inaccuracy in dealing with $\mathcal{S}_\alpha$ and not $\mathcal{I}_\alpha$ is of order $(\alpha - 1)$. It is, however, possible to pinpoint other very important classes of systems with $\alpha \neq 1$ still obeying (2.29). Clearly, various improved estimates can be devised if some additional assumptions are made about the system. One particularly important case which is pertinent to the $\alpha < 1$ region, namely the case of large deviations, will be briefly discussed now.

Systems with large deviations prove fruitful in many areas of physics and mathematics ranging from fluid dynamics and weather forecasting to population breeding. To proceed we will appeal to Loeve's (or basic) inequality of probability theory [29]. Let $X$ be an arbitrary random variable and let $g$ be an even function on $\mathbb{R}$ and non-decreasing on $[0, \infty)$. Then for $\forall a \geq 0$

$$\langle g(X) \rangle - g(a) \leq \sup g(X)\; P[\,|X| \geq a\,]. \tag{2.31}$$

Upon taking the distribution $\varrho(q) = \{(p_k)^q / \sum_k (p_k)^q\}$, $q \in [0, 1]$ and $g(x) = |x|^{\alpha - q}$, $\alpha \in [0, 1]$ we get from (2.31)

$$\langle |X|^{\alpha - q} \rangle_q - a^{\alpha - q} \leq \sup(|X|^{\alpha - q})\; P[\,|X| \geq a\,]. \tag{2.32}$$
Here $\langle \ldots \rangle_q$ is the mean with respect to $\varrho(q)$. We can now set $|X| = \mathcal{P} = \{p_k\}$ and fix $q$ so as to fulfill $\alpha > q$. Taking

$$a = \left(\frac{1}{\sum_{k=1}^{n} (p_k)^q}\right)^{1/(\alpha - q)} \equiv \left(\frac{1}{Z(q)}\right)^{1/(\alpha - q)}, \tag{2.33}$$

we obtain the probability theory variant of (2.29), namely

$$\sum_{k=1}^{n} (p_k)^{\alpha} - 1 \leq \sup(\mathcal{P}^{\alpha - q})\; P[\mathcal{P} \geq a]\; Z(q) \leq P[\mathcal{P} \geq a]\; Z(q). \tag{2.34}$$

To proceed we realize that for $q \in [0, 1]$ we have $1 \leq Z(q) \leq n^{1-q}$ and hence

$$1 \geq a \geq \left(\frac{1}{n}\right)^{(1-q)/(\alpha - q)}. \tag{2.35}$$

Note particularly that $(1-q)/(\alpha - q) > 1$. Thus if for most $i$'s the inequality $p_i < (1/n)^{(1-q)/(\alpha - q)}$ holds (rare events) then $P[\mathcal{P} \geq a]$ of (2.34) can be made arbitrarily small^1. Besides, because $Z(q)$ is bounded by $n^{1-q}$ irrespective of a particular choice of $\mathcal{P}$ and $\alpha$, we may use this freedom to fix the RHS of (2.34) to be very small. So for example when most $p_i \leq 1/n^2$ then the choice $q = 1/2$ and $\alpha = 3/4$ assures that $Z(q) \leq \sqrt{n}$ while $P[\mathcal{P} \geq a] \approx 1/n$, and hence the RHS of (2.34) is smaller than $1/\sqrt{n}$. It should be recognized that in this case the inequality (2.29) holds not because $\alpha \approx 1$ but because $n$ is large.

It is interesting to consider now the situation when $|(1-\alpha)\mathcal{S}_\alpha| \geq 1$. Such a case is undoubtedly more intriguing than the previous one as it represents a wider class of physically relevant situations. Let us start first with the situation $|(1-\alpha)\mathcal{S}_\alpha| \approx 1$. There are two cases of interest here. The case when $(1-\alpha)\mathcal{S}_\alpha \approx 1$ is the simpler one. Here $\alpha < 1$ due to the positivity of $\mathcal{S}_\alpha$ and we may rewrite (2.27) as

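$$\begin{aligned}
\mathcal{I}_\alpha &= \frac{1}{k}\left(\int_0^{1/(1-\alpha)} + \int_{1/(1-\alpha)}^{\mathcal{S}_\alpha}\right) dx\, \frac{1}{1 + x(1-\alpha)} \\
&= \frac{1}{(1-\alpha)} + \frac{1}{k}\,\frac{\mathcal{S}_\alpha - 1/(1-\alpha)}{2} + \mathcal{O}\left(\frac{[(1-\alpha)\mathcal{S}_\alpha - 1]^2}{(1-\alpha)}\right) \\
&= \frac{\mathcal{S}_\alpha}{2k} + \frac{1}{(1-\alpha)}\left(1 - \frac{1}{2k}\right) + \mathcal{O}\left(\frac{[(1-\alpha)\mathcal{S}_\alpha - 1]^2}{(1-\alpha)}\right).
\end{aligned} \tag{2.36}$$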



^1 Of course, due to the normalization condition $\sum_k p_k = 1$, $P[\mathcal{P} \geq a]$ cannot be zero since there must always be a very small probability for large (i.e., $> 1/n$) $p_i$'s. Hence the name large deviations.



On the other hand, the case when $(1-\alpha)\mathcal{S}_\alpha \approx -1$ is very important as it corresponds to the large $\alpha$ limit. Since for high $\alpha$, $\mathcal{S}_\alpha$ asymptotically approaches $\zeta = [(p_k)^{\alpha}_{\max} - 1]/(1-\alpha)$ from above, we can write




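$$\begin{aligned}
\mathcal{I}_\alpha &= \frac{1}{k}\left(\int_0^{\zeta} + \int_{\zeta}^{\mathcal{S}_\alpha}\right) dx\, \frac{1}{1 + x(1-\alpha)} \\
&= \frac{\alpha \ln(p_k)_{\max}}{k(1-\alpha)} + \frac{\mathcal{S}_\alpha(1-\alpha) + (1 - (p_k)^{\alpha}_{\max})}{k(1-\alpha)(p_k)^{\alpha}_{\max}} + \mathcal{O}\left(\left[\mathcal{I}_\alpha + \log_2(p_k)_{\max}\right]\right) \\
&= \frac{\mathcal{S}_\alpha}{k\,(p_k)^{\alpha}_{\max}} + \frac{1 + (p_k)^{\alpha}_{\max}\left[\alpha \ln(p_k)_{\max} - 1\right]}{k(1-\alpha)(p_k)^{\alpha}_{\max}} + \ldots
\end{aligned} \tag{2.37}$$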



In both previous cases we have seen that the leading orders yielded a linear relationship between Renyi's and THC entropy. As already recognized by Schrodinger [30], statistical entropy is defined up to a linear transformation. This, in turn, one could view as a conceptual backing for THC entropy in the respective situations. One's pleasure is short-lived, however, when one starts to consider the case $(1-\alpha)\mathcal{S}_\alpha \gg 1$. This corresponds, for example, to the situation when $\alpha \to 0$. Writing (2.27) as



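$$\begin{aligned}
\mathcal{I}_\alpha &= \frac{1}{k}\left(\int_0^{1/(1-\alpha)} + \int_{1/(1-\alpha)}^{\mathcal{S}_\alpha}\right) dx\, \frac{1}{1 + x(1-\alpha)} \\
&= \frac{1}{(1-\alpha)} + \frac{1}{k}\int_{1/(1-\alpha)}^{\mathcal{S}_\alpha} dx\, \frac{1}{x(1-\alpha)} \sum_{n=0}^{\infty}\left(\frac{-1}{x(1-\alpha)}\right)^{n} \\
&= \frac{\ln(\mathcal{S}_\alpha(1-\alpha))}{k(1-\alpha)} + \frac{1}{k(1-\alpha)}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}\left(\frac{1}{\mathcal{S}_\alpha(1-\alpha)}\right)^{n},
\end{aligned} \tag{2.38}$$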



we see that there is a logarithmic singularity at large $\mathcal{S}_\alpha$. Hence, no linear mapping between THC and Renyi's entropy exists in this region. One may thus expect that for $(1-\alpha)\mathcal{S}_\alpha \gg 1$ both entropies have qualitatively different behavior and the conceptual grounding for THC entropy must be sought outside the scope of information theory.

Let us add two more comments. It is often argued that concavity of THC entropy with respect to the probability distribution makes it better suited, say, for thermodynamic considerations. It is, however, concavity with respect to extensive variables rather than the probability distribution which ensures stability of thermodynamic equilibrium [14]. The first does not necessarily imply the second. Needless to say, there is no general concavity requirement for entropy in non-equilibrium systems. Secondly, from Eq.(2.27) we see that THC entropy and Renyi's entropy are monotonic functions of each other and, as a result, both must be maximized by the same probability distribution. However, while Renyi's entropy is additive, THC entropy is not, so that it appears that the additivity property is not important for entropies required for maximization purposes.

III. RENYI'S ENTROPY OF CONTINUOUS 
PROBABILITY DISTRIBUTIONS 

While in the previous section we dealt with Renyi's entropy of discrete probability distributions, we will now discuss the corresponding continuous counterpart. We shall see that in the latter case a host of new properties will emerge. As a byproduct we get a consistent extension of THC entropy for continuous distributions.

Let us first assume that $\mathcal{F}(x)$ is an arbitrary continuous, positive probability density function (PDF) defined, say, in the interval $[0, 1]$. By defining the integrated probability

$$p_{nk} = \int_{k/n}^{(k+1)/n} dx\, \mathcal{F}(x); \qquad k = 0, 1, \ldots, n - 1,$$

we generate the discrete distribution $\mathcal{P}_n = \{p_{nk}\}$. It might then be shown [3,4] that

$$\mathcal{I}_\alpha(\mathcal{F}) \equiv \lim_{n \to \infty}\left(\mathcal{I}_\alpha(\mathcal{P}_n) - \log_2 n\right) = \frac{1}{(1-\alpha)} \log_2\left(\int_0^1 dx\, \mathcal{F}^{\alpha}(x)\right), \tag{3.1}$$

provided that $\int_0^1 dx\, \mathcal{F}^{\alpha}(x)$ exists^2. Here $\log_2 n$ must be subtracted to ensure a correct measure in the integral. Defining the uniform distribution $\mathcal{E}_n = \{\frac{1}{n}, \ldots, \frac{1}{n}\}$, we have $\log_2 n = \mathcal{I}_\alpha(\mathcal{E}_n)$. From this we may interpret $-\mathcal{I}_\alpha(\mathcal{F}) \equiv \lim_{n \to \infty}\left(\mathcal{I}_\alpha(\mathcal{E}_n) - \mathcal{I}_\alpha(\mathcal{P}_n)\right)$ as the gain of information obtained by replacing the uniform distribution $\mathcal{E}_n$ (having maximal uncertainty) by the distribution $\mathcal{P}_n$ or, in other words, $-\mathcal{I}_\alpha(\mathcal{F})$ represents the decrease of uncertainty when $\mathcal{E}_n$ is replaced by $\mathcal{P}_n$. In the case of Shannon's entropy the quantity $-\mathcal{H}(\mathcal{F})$ is usually called the informative content or "negentropy" and states how much uncertainty is still left unresolved after a measurement (for discussion see e.g., [33,34]).

Relation (3.1) can be viewed as a renormalized Renyi's information content. This may be understood from the asymptotic expansion of $\mathcal{I}_\alpha(\mathcal{P}_n)$, namely

$$\mathcal{I}_\alpha(\mathcal{P}_n) = \text{divergent in } n + \text{finite} + o(1), \tag{3.2}$$

where the $o(1)$ symbol means that the residual error tends to 0 for $n \to \infty$. The finite part ($= \mathcal{I}_\alpha(\mathcal{F})$) is fixed by the requirement (or by the renormalization prescription) that it should fulfill the postulate of additivity in order to be identifiable with an information measure. Incidentally, the latter uniquely identifies the divergent part as $\log_2 n$. The above renormalization procedure is somewhat analogous to that in quantum field theory where one renormalizes energy by subtracting the ground state contribution. It should be, however, noted that the information $\log_2 n$ is usually greater than $\mathcal{I}_\alpha(\mathcal{P}_n)$ and consequently $\mathcal{I}_\alpha(\mathcal{F})$ is not positive. The former should be contrasted with the discrete case where $\mathcal{I}_\alpha$ is by construction non-negative.

^2 For $0 < \alpha < 1$ this is always the case as $\sum_k (p_{nk})^{\alpha} <$ ...
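
The renormalization in (3.1) can be watched at work for a simple density. The sketch below assumes the illustrative PDF $\mathcal{F}(x) = 2x$ on $[0, 1]$, for which $p_{nk} = (2k+1)/n^2$ and $\int_0^1 dx\, \mathcal{F}^{\alpha} = 2^{\alpha}/(\alpha + 1)$; the function names and the order $\alpha = 2$ are likewise choices made here.

```python
import math
from typing import List


def renyi_entropy(ps: List[float], alpha: float) -> float:
    """Discrete Renyi entropy in bits, Eq. (2.17)."""
    return math.log2(sum(p ** alpha for p in ps)) / (1.0 - alpha)


def discretize(n: int) -> List[float]:
    """Integrated probabilities p_nk of the assumed density F(x) = 2x over n equal cells."""
    return [(2 * k + 1) / n ** 2 for k in range(n)]


def renyi_continuous(alpha: float) -> float:
    """Right-hand side of (3.1) for F(x) = 2x, using the exact integral 2^alpha / (alpha + 1)."""
    return math.log2(2.0 ** alpha / (alpha + 1.0)) / (1.0 - alpha)


alpha = 2.0
limit_value = renyi_continuous(alpha)

for n in (10, 100, 1000, 10000):
    renormalized = renyi_entropy(discretize(n), alpha) - math.log2(n)
    print(n, renormalized, limit_value)
```

The renormalized values settle on a negative number, illustrating the remark that the finite part need not be positive even though every discrete $\mathcal{I}_\alpha(\mathcal{P}_n)$ is.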

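Extension of (3.1) to $d$-dimensional situations is straightforward. Having a $d$-dimensional random variable (i.e., experiment) $\mathcal{X}^{(d)}$ we can discretize it in the following way: $\mathcal{X}_n^{(d)} = \left([n\mathcal{X}_1]/n, \ldots, [n\mathcal{X}_d]/n\right)$ where $[\ldots]$ denotes the integral part. This divides the $d$-dimensional volume $V$ of the outcome (or sample) space into boxes labelled by an index $k$ which runs from 1 up to $[V n^d]$. The size of the $k$th box is $l = 1/n$ and its probability distribution $\mathcal{P}_n^{(d)} = \{p_{nk}\}$ is generated via the prescription

$$p_{nk} = \int_{k\text{th box}} d^dx\, \mathcal{F}(\mathbf{x}); \qquad k = 1, 2, \ldots, [V n^d].$$

It can then be shown (see e.g., [3] and Appendix D) that

$$\mathcal{I}_\alpha^{(d)}(\mathcal{F}) \equiv \lim_{n \to \infty}\left(\mathcal{I}_\alpha(\mathcal{P}_n^{(d)}) - d \log_2 n\right) = \frac{1}{(1-\alpha)} \log_2\left(\int_V d^dx\, \mathcal{F}^{\alpha}(\mathbf{x})\right), \tag{3.3}$$

provided that $\int_V d^dx\, \mathcal{F}^{\alpha}(\mathbf{x})$ exists.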

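The question now is whether we get a unique $\mathcal{I}_\alpha^{(d)}(\mathcal{F})$ by mimicking the previous recipe, i.e., by performing the asymptotic expansion of $\mathcal{I}_\alpha(\mathcal{P}_n^{(d)})$ and pinpointing the correct finite part by the renormalization condition, namely additivity of information. In a non-unit volume, however, one more fixing condition is required. To see this we define the uniform distribution $\mathcal{E}_n^{(d)} = \left\{\frac{1}{V_n n^d}, \ldots, \frac{1}{V_n n^d}\right\}$ with $V_n = [V n^d]/n^d \to V$. Renyi's entropy then reads

$$\mathcal{I}_\alpha(\mathcal{E}_n^{(d)}) = \log_2 V_n + d \log_2 n,$$

and so

$$\lim_{n \to \infty}\left(\mathcal{I}_\alpha(\mathcal{P}_n^{(d)}) - \mathcal{I}_\alpha(\mathcal{E}_n^{(d)})\right) = \frac{1}{(1-\alpha)} \log_2\left(\frac{\int_V d^dx\, \mathcal{F}^{\alpha}(\mathbf{x})}{\int_V d^dx\, 1/V^{\alpha}}\right). \tag{3.4}$$

As in (3.3), the RHS of (3.4) represents the finite part in the asymptotic expansion of $\mathcal{I}_\alpha(\mathcal{P}_n^{(d)})$, the part which fulfils the additivity of information condition. To ensure the uniqueness of Renyi's entropy in the case of continuous distributions we must, in addition, fix the value of the finite part at $\mathcal{F} = (1/V)$. It is then a matter of taste and/or of the particular problem at hand which convention should be used. In this paper we will use the renormalization prescription where $\mathcal{I}_\alpha^{(d)}(1/V)|_{\text{finite}} = \log_2 V$ (i.e., the one which implies Eq.(3.3)). The latter merely means that we define Renyi's entropy with PDF $\mathcal{F}$ as

$$\mathcal{I}_\alpha^{(d)}(\mathcal{F}) \equiv \lim_{n \to \infty}\left(\mathcal{I}_\alpha(\mathcal{P}_n^{(d)}) - \mathcal{I}_\alpha(\mathcal{E}_n^{(d)})\big|_{V=1}\right) = \frac{1}{(1-\alpha)} \log_2\left(\int_V d^dx\, \mathcal{F}^{\alpha}(\mathbf{x})\right). \tag{3.5}$$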



In Section IV we generalize results (3.4) and (3.5) to fractal and multifractal systems. A comment is in order. It may be shown (see Appendix E) that the form (3.4) is, in fact, a better candidate for the information measure than (3.3) as it is invariant under a transformation of $\mathcal{X}^{(d)}$. However, the difference between (3.3) and (3.4) is often only a constant, which ensures that for the questions we address here it is quite adequate to use the simpler form (3.3). It should be, however, clear that there are systems of physical interest where the ground-state entropy plays a central role (e.g., frustrated spin systems or quantum liquids). In such cases the form (3.4) is obligatory.

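Let us now examine the implications of (3.1)-(3.4) for THC entropy with continuous distributions. For this we will use the convention introduced before Eq.(3.3). Firstly, from (2.27) and (3.3) it follows that $[\mathcal{I}_\alpha(\mathcal{P}_n) - d \log_2 n]$ is finite at large $n$ (provided $\int_V d^dx\, \mathcal{F}^{\alpha}(\mathbf{x})$ exists) and so

$$\frac{(1-\alpha)\mathcal{S}_\alpha(\mathcal{P}_n)}{n^{d(1-\alpha)}} + \frac{1}{n^{d(1-\alpha)}} = \int_V d^dx\, \mathcal{F}^{\alpha}(\mathbf{x}) + o(1). \tag{3.6}$$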



In order to obtain the correct THC entropy with PDF $\mathcal{F}$ it is conceptually simplest to follow the same route as before, i.e., to asymptotically expand $\mathcal{S}_\alpha(\mathcal{P}_n)/n^{d(1-\alpha)}$ and look for the finite part which conforms to a certain renormalization prescription^3. Unlike the Renyi entropy case, we do not now have any first-principle renormalization prescription (a la additivity of information) which we could impose. As a matter of fact, one could be tempted to use the THC pseudo-additivity condition to isolate the proper finite part in the $\mathcal{S}_\alpha(\mathcal{P}_n)/n^{d(1-\alpha)}$ expansion, but such a renormalization condition would be clearly ad hoc as there is no a priori reason to assume that the non-extensivity condition obeys the same prescription in the continuous case. It is considerably safer to follow the analogy with Eqs.(3.4) and (3.5), demanding, for instance, consistency for $\alpha$'s in the complex vicinity of $\alpha = 1$ (i.e., values at which Renyi and THC entropies coincide). If the consistency is reached then the validity of the result can be analytically continued to the whole domain of analyticity of $\mathcal{S}_\alpha$, so particularly to $\alpha \in \mathbb{R}^{+}$.

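^It is indeed $\mathcal{S}_\alpha(\mathcal{P}_n)/n^{d(1-\alpha)}$ rather than $\mathcal{S}_\alpha(\mathcal{P}_n)$ which
should be asymptotically expanded. For instance, for $0 < \alpha < 1$
the asymptotic expansion of $\mathcal{S}_\alpha(\mathcal{P}_n)$ would be $o(1)$
and so the corresponding large $n$ limit would be trivial. It is
not difficult to see that it is only the fraction $\mathcal{S}_\alpha(\mathcal{P}_n)/n^{d(1-\alpha)}$
which has a sensible meaning in the large $n$ limit.

Using the asymptotic expansions:

$$\frac{\mathcal{S}_\alpha(\mathcal{P}_n)}{n^{d(1-\alpha)}} = -\frac{1}{(1-\alpha)\,n^{d(1-\alpha)}} + \frac{1}{(1-\alpha)}\int_V d^dx\, \mathcal{F}^\alpha(x) + o(1),$$

$$\frac{\mathcal{S}_\alpha(\mathcal{E}_n)}{n^{d(1-\alpha)}} = -\frac{1}{(1-\alpha)\,n^{d(1-\alpha)}} + \frac{1}{(1-\alpha)}\int_V d^dx\, \frac{1}{V^\alpha} + o(1), \qquad (3.7)$$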

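we may immediately write

$$\tilde{\mathcal{S}}_\alpha(\mathcal{F}) \equiv \lim_{n\to\infty}\left(\frac{\mathcal{S}_\alpha(\mathcal{P}_n)}{n^{d(1-\alpha)}} - \frac{\mathcal{S}_\alpha(\mathcal{E}_n)}{n^{d(1-\alpha)}}\right) = \frac{1}{(1-\alpha)}\int_V d^dx\left(\mathcal{F}^\alpha(x) - \frac{1}{V^\alpha}\right),$$

$$\mathcal{S}_\alpha(\mathcal{F}) \equiv \lim_{n\to\infty}\left(\frac{\mathcal{S}_\alpha(\mathcal{P}_n)}{n^{d(1-\alpha)}} - \left.\frac{\mathcal{S}_\alpha(\mathcal{E}_n)}{n^{d(1-\alpha)}}\right|_{V=1}\right) = \frac{1}{(1-\alpha)}\left(\int_V d^dx\, \mathcal{F}^\alpha(x) - 1\right). \qquad (3.8)$$

It is not difficult to check that for $|\alpha| \in [1-\epsilon, 1+\epsilon]$, $\epsilon \ll 1$,
(3.8) is consistent with (3.4) and (3.5).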

Let us note at the end that from the asymptotic expansion
of $\mathcal{I}_\alpha(\mathcal{P}_n)$, i.e., from

$$\mathcal{I}_\alpha(\mathcal{P}_n) = d\log_2 n + \mathcal{I}_\alpha(\mathcal{F}) + o(1), \qquad (3.9)$$

we find, in return, that the dimension $d$ is identified with

$$d(\alpha) = \lim_{n\to\infty}\frac{\mathcal{I}_\alpha(\mathcal{P}_n)}{\log_2 n}. \qquad (3.10)$$

For simple metric (outcome) spaces (like $\mathbb{R}^d$) we will
prove in the following section that $d(\alpha) = d$ for all $\alpha$
and that it coincides with the usual topological dimension.
This situation is, however, not generic. In the next section
we shall see what modifications should be made when
(multi)fractal systems are in question.
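
The expansion (3.9) can be checked numerically on a simple example. The following Python sketch is an illustration only and not part of the original derivation: the density $\mathcal{F}(x) = 2x$ on $[0,1]$ (so $d = 1$, $V = 1$) and all function names are assumptions chosen for convenience. It compares $\mathcal{I}_\alpha(\mathcal{P}_n) - d\log_2 n$ with $\frac{1}{1-\alpha}\log_2\int_V d^dx\,\mathcal{F}^\alpha(x)$.

```python
import numpy as np


def renyi_entropy(p, alpha):
    """Renyi entropy in bits of a discrete distribution p, for alpha != 1."""
    p = np.asarray(p, dtype=float)
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)


def binned_probabilities(n):
    """Integrated probabilities of the assumed PDF F(x) = 2x over n equal cells of [0, 1]."""
    edges = np.linspace(0.0, 1.0, n + 1)
    return edges[1:] ** 2 - edges[:-1] ** 2  # exact cell integrals of 2x


def finite_part(n, alpha, d=1):
    """I_alpha(P_n) - d*log2(n): what remains after removing the divergent term."""
    return renyi_entropy(binned_probabilities(n), alpha) - d * np.log2(n)


def continuum_value(alpha):
    """(1/(1-alpha)) * log2 of the integral of (2x)**alpha over [0, 1]."""
    integral = 2.0 ** alpha / (alpha + 1.0)
    return np.log2(integral) / (1.0 - alpha)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, finite_part(n, 2.0), continuum_value(2.0))
```

In this example the two printed columns approach each other as $n$ grows, which is the content of (3.9) for this particular density.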



IV. RENYI'S PARAMETER AND 
(MULTI)FRACTAL DIMENSION 

Fractals, objects with a generally non-integer dimension
exhibiting the scaling property and the property of
self-similarity, have had a significant impact not only on
mathematics but also on such distinct fields as physical
chemistry, astrophysics, physiology, and fluid mechanics.
The key characteristic of fractals is the fractal dimension,
which is defined as follows: Consider a set $M$ embedded
in a $d$-dimensional space. Let us cover the set with a
mesh of $d$-dimensional cubes of size $l^d$ and let $N_l(M)$ be
the minimal number of cubes needed for the covering.
The fractal dimension (or similarity dimension) of $M$ is
then defined as [35,36]

$$D = -\lim_{l\to 0}\frac{\ln N_l(M)}{\ln l}. \qquad (4.1)$$



In most cases of interest the fractal dimension (4.1) co- 
incides with the Hausdorff-Besicovich fractal dimension 
used by Mandelbrot [35]. 
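
As an illustration of the box-counting definition (4.1), the sketch below estimates $D$ for the middle-thirds Cantor set. The choice of set, the integer encoding of the points and the function names are assumptions made for this example; the expected value $\ln 2/\ln 3$ is the standard result for this set.

```python
import numpy as np


def cantor_points(level):
    """Left endpoints of the level-`level` Cantor prefractal, as integers in units of 3**-level."""
    pts = np.array([0], dtype=np.int64)
    for k in range(1, level + 1):
        # Each refinement keeps the left third and adds a copy shifted by 2/3**k.
        pts = np.concatenate([pts, pts + 2 * 3 ** (level - k)])
    return pts


def box_count(pts, level, m):
    """Number of boxes of size l = 3**-m containing at least one point."""
    return np.unique(pts // 3 ** (level - m)).size


def box_dimension(level=12, scales=range(2, 11)):
    """Slope estimate of D = -ln N_l / ln l over box sizes l = 3**-m."""
    pts = cantor_points(level)
    log_l = np.array([-m * np.log(3.0) for m in scales])
    log_n = np.array([np.log(box_count(pts, level, m)) for m in scales])
    slope = np.polyfit(log_l, log_n, 1)[0]
    return -slope


if __name__ == "__main__":
    print(box_dimension(), np.log(2.0) / np.log(3.0))
```

Integer arithmetic is used for the box indices so that points lying on box boundaries are not misassigned by floating-point rounding.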

Multifractals, on the other hand, are related to the
study of a distribution of physical or other quantities on
a generic support (be it fractal or not) and thus provide
a move from the geometry of sets as such to geometric
properties of distributions. An intuitive picture of the
inner structure of multifractals is obtained by introducing
the $f(a)$ spectrum [5,37]. To elucidate the latter let
us suppose that over some support (usually a subset of
a metric space) there is distributed a probability of a certain
phenomenon, be it e.g., probability of electric charge,
magnetic momenta, hydrodynamic vorticity or mass. If
we cover the support with boxes of size $l$ and denote the
integrated probability in the $i$th box as $p_i$, we may define
the local scaling exponent by

$$p_i(l) \sim l^{a_i}, \qquad (4.2)$$

where $a_i$ is called the Lipshitz-Holder exponent. Here
and throughout the symbol $\sim$ indicates an asymptotic
relation, e.g., (4.2) should read:

$$a_i = \lim_{l\to 0}\frac{\ln p_i(l)}{\ln l}.$$

The proportionality constant (say $c(a_i)$) in (4.2) can be
weakly dependent on $l$. By "weakly" we mean that

$$\lim_{l\to 0}\frac{\ln c(a_i)}{\ln l} = 0.$$

Note that the PDF of each of the small pieces is

$$\rho_i = \frac{p_i}{l^d} \sim l^{a_i - d}, \qquad (4.3)$$

and so $a_i$ controls the singularity of $\rho_i$. For this reason $a_i$ is
also known as the singularity exponent.

Counting the number of boxes $dN(a)$ where $p_i$ has a
singularity exponent between $a$ and $a + da$, $f(a)$ defines
the fractal dimension of the set of boxes with the
singularity exponent $a$ by

$$dN(a) \sim l^{-f(a)}\, da. \qquad (4.4)$$

Here $f(a)$ is called the singularity spectrum. A multifractal
can then be viewed as an ensemble of intertwined
(uni)fractals, each with its own fractal dimension $f(a)$.
So $f(a)$ describes how densely the subsystems with the
singularity exponent $a$ are distributed. It should be noted
that the power law behaviors (4.2) and (4.4) are the
fundamental assumptions of the multifractal analysis.



A convenient way to keep track of the $p_i$'s is to
examine the scaling of the corresponding moments. For
this purpose one can define a "partition function" as

$$Z(q) = \sum_i p_i^q = \int da'\, n(a')\, l^{-f(a')}\, l^{q a'}, \qquad (4.5)$$

($n(a)$ is a (weakly $l$ dependent) proportionality function
having its origin in relations (4.2) and (4.4)). In the small
$l$ case the asymptotic behavior of the partition function
can be evaluated by the method of steepest descents. As
a result we get the scaling

$$Z(q) \sim l^{\tau}, \qquad (4.6)$$

with

$$\tau(q) = \min_a\left(qa - f(a)\right) = q\,a_0(q) - f(a_0(q)),$$

$$f'(a_0(q)) = q \quad \text{and} \quad a_0(q) = \tau'(q). \qquad (4.7)$$

These are precisely the Legendre transform relations.
The scaling function $\tau(q)$ is called the correlation exponent or
mass exponent of the $q$th order. So for the purpose of
multifractal description we may use either of the conjugated
couples $f(a_0)$, $a_0$ or $\tau(q)$, $q$. For future reference
we will need to know that $\tau(0) = -D$ and $\tau(1) = 0$ (see
e.g., [35]). Let us finally stress that, if not stated otherwise,
we will often "abuse" notation and write simply $a$
instead of $a_0$.
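
The scaling (4.6) and the Legendre relations (4.7) can be illustrated on a binomial multiplicative cascade, for which the box probabilities at scale $l = 2^{-k}$ are products of $p$ and $1-p$. This measure is a standard textbook example and is an assumption of the sketch, not a system discussed in the text; the function names are likewise hypothetical. For it $\tau(q) = -\log_2(p^q + (1-p)^q)$, so that $\tau(0) = -1 = -D$ and $\tau(1) = 0$.

```python
import numpy as np


def binomial_measure(p, k):
    """Box probabilities of a binomial cascade after k steps (2**k boxes of size 2**-k)."""
    probs = np.array([1.0])
    for _ in range(k):
        probs = np.concatenate([probs * p, probs * (1.0 - p)])
    return probs


def tau_numeric(p, q, k=12):
    """tau(q) estimated as ln Z(q) / ln l with Z(q) = sum_i p_i**q and l = 2**-k."""
    probs = binomial_measure(p, k)
    return np.log(np.sum(probs ** q)) / np.log(2.0 ** -k)


def tau_exact(p, q):
    """Closed form of tau(q) for the binomial cascade."""
    return -np.log2(p ** q + (1.0 - p) ** q)


def legendre_spectrum(p, q, dq=1e-5):
    """Return (a0, f) with a0 = tau'(q) (central difference) and f = q*a0 - tau(q)."""
    a0 = (tau_exact(p, q + dq) - tau_exact(p, q - dq)) / (2.0 * dq)
    return a0, q * a0 - tau_exact(p, q)


if __name__ == "__main__":
    for q in (0.0, 0.5, 1.0, 2.0):
        print(q, tau_numeric(0.3, q), tau_exact(0.3, q), legendre_spectrum(0.3, q))
```

At $q = 1$ the sketch returns $a_0 = f(a_0)$, in line with the property $a(1) = f(a(1))$ used later in the text.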



A. Generalization of Eqs.(3.4) and (3.5) to fractal 
sample spaces and multifractals 

With the definitions of (multi)fractal dimensions at
hand we may now generalize Eqs.(3.4) and (3.5). Let us
assume first that we have a fractal support $M$ on which
a continuous PDF $\mathcal{F}(x)$ is defined. Following the
renormalization prescription of Section III we know that in order
to obtain the renormalized Renyi's entropy we have to
know $\mathcal{I}_\alpha(\mathcal{E}_n)$. This can be done by realizing that the
uniform distribution is now $\mathcal{E}_n = \{1/N_l, \ldots, 1/N_l\}$. Here
$N_l$ is the minimal covering (with cubes of size $l^d$) of the
fractal set in question and $n = 1/l$. Due to the scaling law
(4.1) the (pre)fractal volume $V_l = N_l\, l^D$ converges to the
actual (finite) fractal volume $V$ in the $l \to 0$ limit. As a
result $\mathcal{E}_n = \{l^D/V_l, \ldots, l^D/V_l\}$ and hence

$$\mathcal{I}_\alpha(\mathcal{E}_n) = \log_2 V_l - D\log_2 l. \qquad (4.8)$$



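In the $n \to \infty$ (i.e., $l \to 0$) limit we prove in Appendix D
that either

$$\tilde{\mathcal{I}}_\alpha(\mathcal{F}) \equiv \lim_{n\to\infty}\left(\mathcal{I}_\alpha(\mathcal{P}_n) - \mathcal{I}_\alpha(\mathcal{E}_n)\right) = \frac{1}{(1-\alpha)}\log_2\left(\frac{\int_M d\mu\, \mathcal{F}^\alpha(x)}{\int_M d\mu\, 1/V^\alpha}\right), \qquad (4.9)$$

or

$$\mathcal{I}_\alpha(\mathcal{F}) \equiv \lim_{n\to\infty}\left(\mathcal{I}_\alpha(\mathcal{P}_n) - \mathcal{I}_\alpha(\mathcal{E}_n)|_{V=1}\right) = \lim_{n\to\infty}\left(\mathcal{I}_\alpha(\mathcal{P}_n) + D\log_2 l\right) = \frac{1}{(1-\alpha)}\log_2\left(\int_M d\mu\, \mathcal{F}^\alpha(x)\right), \qquad (4.10)$$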



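in conformity with the chosen renormalization prescription.
The measure $\mu$ is the Hausdorff measure. Note
that the RHS's of (4.9) and (4.10) are finite provided the
integral $\int_M d\mu\, \mathcal{F}^\alpha(x)$ exists. From (4.10) the asymptotic
expansion (3.9) for $\mathcal{I}_\alpha(\mathcal{P}_n)$ reads

$$\mathcal{I}_\alpha(\mathcal{P}_n) = D\log_2 n + \mathcal{I}_\alpha(\mathcal{F}) + o(1). \qquad (4.11)$$

This means that $d(\alpha)$ defined in (3.10) boils down to

$$d(\alpha) = \lim_{n\to\infty}\frac{\mathcal{I}_\alpha(\mathcal{P}_n)}{\log_2 n} = D, \quad \text{for } \forall\, \alpha. \qquad (4.12)$$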



We remark that the information measure $D\log_2 n$
appearing in (5.47) and (4.10) is nothing but an
information-theoretical analogue of the Boltzmann
entropy: $S = k_B \ln W$ ($k_B$ is the Boltzmann constant and
$W$ is the number of accessible microstates). This is so because
both $\mathcal{I}_\alpha(\mathcal{E}_n)$ ($= \mathcal{I}_1(\mathcal{E}_n)$ for $\forall\,\alpha$) and the Boltzmann
entropy $S$ describe systems where all possible outcomes
(or accessible microstates) are assigned equal probabilities
(constant PDF). Thus $\mathcal{I}_\alpha(\mathcal{E}_n)$, like $S$, is the
maximal attainable entropy compatible with a given set of
all possible outcomes (or accessible microstates).

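The foregoing analysis can also be applied to multifractals.
In fact, by employing the multifractal measure [36]

$$\mu_{\mathcal{P}}^{(\alpha)}(d; l) = \sum_{k\text{th box}} \frac{p_{nk}^\alpha}{l^d} \;\xrightarrow{\; l\to 0\;}\; \begin{cases} 0 & \text{if } d < \tau(\alpha) \\ \infty & \text{if } d > \tau(\alpha), \end{cases} \qquad (4.13)$$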



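we prove in Appendix F that

$$\mathcal{I}_\alpha(\mu_{\mathcal{P}}) \equiv \lim_{n\to\infty}\left(\mathcal{I}_\alpha(\mathcal{P}_n) - \mathcal{I}_\alpha(\mathcal{E}_n)|_{V=1}\right) = \lim_{l\to 0}\left(\mathcal{I}_\alpha(\mathcal{P}_n) + \frac{\tau(\alpha)}{(\alpha-1)}\log_2 l\right) = \frac{1}{(1-\alpha)}\log_2\left(\int_a d\mu_{\mathcal{P}}^{(\alpha)}(a)\right). \qquad (4.14)$$

Eq.(4.14) implies the asymptotic expansion

$$\mathcal{I}_\alpha(\mathcal{P}_n) = \frac{\tau(\alpha)}{(\alpha-1)}\log_2 n + \mathcal{I}_\alpha(\mu_{\mathcal{P}}) + o(1). \qquad (4.15)$$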



Consequently we note that $d(\alpha)$ of (3.10) reads

$$d(\alpha) = \lim_{n\to\infty}\frac{\mathcal{I}_\alpha(\mathcal{P}_n)}{\log_2 n} = \frac{\tau(\alpha)}{(\alpha-1)}. \qquad (4.16)$$

Unlike in fractal sample spaces, in multifractals $d(\alpha)$
depends on $\alpha$. Note that in the case of smooth PDF's
the integrated probability $p_i(l)$ scales as $l^D$ and so we
have a unifractal characterized by a single dimension
$a = f(a) = D$. This implies that $\tau/(\alpha - 1) = D$ and
hence for smooth PDF's we naturally recover the result
(4.12). It should be emphasized that when the outcome
space is a simple metric space (like $\mathbb{R}^d$) then it is known
that the fractal dimension $D$ coincides with the usual
topological dimension [35,36] and so, for instance, $D = d$
in the case of $\mathbb{R}^d$.



B. Generalized dimensions and reconstruction 
theorem 

After this brief intermezzo we now turn back to the
question whether there is any connection of Renyi's
entropy with (multi)fractal systems. At present it seems to
us that there are at least two such connections. The first,
more formal connection, is associated with the so called
generalized dimensions of the $q$th order defined as:

$$D_q = \lim_{l\to 0}\left(\frac{1}{(q-1)}\frac{\ln Z(q)}{\ln l}\right) = \frac{\tau(q)}{(q-1)}. \qquad (4.17)$$

In passing the reader should notice that $D_q$ is nothing but
$d(\alpha = q)$ introduced in (4.16). A complete knowledge of
the collection of generalized dimensions $D_q$ is equivalent
to a complete physical characterization of the fractal [39].
It should be noted in this connection that the fractal
dimension, the information dimension and the correlation
dimension (all frequently used in deterministic
chaotic systems [40]) are, respectively, $D_0$, $D_1$ and $D_2$.
In fact, all $D_q$ are necessary to describe uniquely general
fractals, e.g., strange attractors [39]. This is analogous to
statistical physics where one needs all cumulants to get
the full density matrix. Mathematically this corresponds
to Hausdorff's moment problem [41].
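
The identification of the generalized dimension with the ratio $\mathcal{I}_q(\mathcal{P}_n)/\log_2 n$ can be made concrete with a short numerical sketch. The binomial cascade used below, the cutoff $k$ and the function names are assumptions of this illustration only; for this measure $D_0 = 1$, $D_1$ equals the binary Shannon entropy of $(p, 1-p)$ and $D_2 = -\log_2(p^2 + (1-p)^2)$.

```python
import numpy as np


def binomial_measure(p, k):
    """Box probabilities of a binomial cascade after k steps (n = 2**k boxes)."""
    probs = np.array([1.0])
    for _ in range(k):
        probs = np.concatenate([probs * p, probs * (1.0 - p)])
    return probs


def renyi_entropy(probs, alpha):
    """Renyi entropy in bits; the alpha -> 1 case is Shannon's entropy."""
    if abs(alpha - 1.0) < 1e-12:
        return -np.sum(probs * np.log2(probs))
    return np.log2(np.sum(probs ** alpha)) / (1.0 - alpha)


def generalized_dimension(p, q, k=12):
    """D_q estimated as I_q(P_n) / log2(n), with log2(n) = k."""
    return renyi_entropy(binomial_measure(p, k), q) / k


if __name__ == "__main__":
    for q in (0.0, 1.0, 2.0):
        print(q, generalized_dimension(0.3, q))
```

For the assumed value $p = 0.3$ the three printed numbers decrease with $q$, as expected of $D_0 \geq D_1 \geq D_2$.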

While the proof in [39] is based on a rather complicated
self-similarity argumentation we can understand the core
of this assertion from a different angle. In fact,
employing information theory we will show that the
assumption of self-similarity is not really fundamental
and that the conclusion of [39] has more general
applicability. For this purpose let us define the
information-distribution function of $\mathcal{P}$ (see e.g., [4]) as

$$\mathcal{F}_{\mathcal{P}}(x) = \sum_{-\log_2 p_k < x} p_k. \qquad (4.18)$$

The latter represents the total probability carried by
events with information content $x_k = -\log_2 p_k < x$.
Note also that for $x < 0$ the sum in (4.18) is empty and
so $\mathcal{F}_{\mathcal{P}}(x) = 0$. Realizing that

$$2^{(1-\alpha)x}\, d\mathcal{F}_{\mathcal{P}}(x) \approx \sum_{x \le x_k < x+dx} 2^{(1-\alpha)x_k}\, p_k = \sum_{x \le x_k < x+dx} p_k^\alpha,$$

we may write

$$\mathcal{I}_\alpha(\mathcal{P}) = \frac{1}{(1-\alpha)}\log_2\left(\int_{x=0}^{\infty} 2^{(1-\alpha)x}\, d\mathcal{F}_{\mathcal{P}}(x)\right). \qquad (4.19)$$

The former integral should be understood in the Stieltjes
sense ($\mathcal{F}_{\mathcal{P}}(x)$ is generally discontinuous). Taking the
inverse Laplace-Stieltjes transform of (4.19) we obtain

$$\mathcal{F}_{\mathcal{P}}(x) = \frac{1}{2\pi i}\int_{-i\infty+\sigma}^{i\infty+\sigma} dp\, \frac{e^{px}\, e^{-p\,\mathcal{I}_\alpha(\mathcal{P})}}{p} = \sum_k \frac{p_k}{2\pi i}\int_{-i\infty+0^+}^{i\infty+0^+} dp\, \frac{e^{p(x+\log_2 p_k)}}{p}, \qquad (4.20)$$

with $p = (\alpha - 1)\ln 2$. The constant $\sigma$ is dictated by the
requirements that it should be positive and that all
singularities of $e^{-p\,\mathcal{I}_\alpha}/p$ should lie to the left of the vertical line
$\Re(p) = \sigma$ in the complex $p$-plane. The fact that $e^{-p\,\mathcal{I}_\alpha}$ is basically
$\sum_k p_k^\alpha$ means that $e^{-p\,\mathcal{I}_\alpha}/p$ is analytic on the half-plane
$\{p\,|\,\Re(p) > 0\}$. As a result we may choose $\sigma = 0^+$. For
$(x + \log_2 p_k) < 0$ we may close the contour by a semicircle
in the right half of the plane. In this region the integrand
is analytic and so $\mathcal{F}_{\mathcal{P}}(x) = 0$ as it should be. For
$(x + \log_2 p_k) > 0$, the semicircle must be placed in the left
half plane, which then yields the correct $\mathcal{F}_{\mathcal{P}}(x)$ of Eq.(4.18).
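
The information-distribution function (4.18) and the Stieltjes representation (4.19) are easy to verify for a finite distribution, where $\mathcal{F}_{\mathcal{P}}(x)$ is a step function jumping by $p_k$ at $x_k = -\log_2 p_k$. The sketch below is a minimal illustration under that reading; the example distribution and the function names are assumptions.

```python
import numpy as np


def information_distribution(p, x):
    """F_P(x): total probability of outcomes with -log2(p_k) < x, cf. (4.18)."""
    p = np.asarray(p, dtype=float)
    return float(np.sum(p[-np.log2(p) < x]))


def renyi_from_stieltjes(p, alpha):
    """I_alpha from the Stieltjes integral of 2**((1-alpha)*x) against F_P, cf. (4.19)."""
    p = np.asarray(p, dtype=float)
    x = -np.log2(p)  # information content of each outcome
    # F_P jumps by p_k at x = x_k, so the Stieltjes integral reduces to a weighted sum.
    integral = np.sum(2.0 ** ((1.0 - alpha) * x) * p)
    return np.log2(integral) / (1.0 - alpha)


def renyi_direct(p, alpha):
    """I_alpha computed directly from sum_k p_k**alpha."""
    p = np.asarray(p, dtype=float)
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)


if __name__ == "__main__":
    dist = [0.5, 0.25, 0.125, 0.125]
    print([information_distribution(dist, x) for x in (0.0, 1.5, 2.5, 10.0)])
    print(renyi_from_stieltjes(dist, 2.0), renyi_direct(dist, 2.0))
```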

A disadvantage of the inverse formula (4.20) is that $p$
(and so $\alpha$) takes its values from $\mathbb{C}$, or more specifically,
one needs (at best) all complex $p$'s belonging to a small
circle around $p = 0$ to reconstruct the underlying distribution.
It is however clear that in order to determine
how many $\alpha$'s are really needed to fully reconstruct $\mathcal{P}$
one must resort to the real inverse Laplace transform instead.
Such an inversion indeed exists and is provided by
the so called Widder-Stieltjes inverse formula [41]:

$$\mathcal{F}_{\mathcal{P}}(x) = \sum_{n=0}^{\Lambda} \frac{(-\Lambda/x)^n}{n!}\left[\exp\left(-\frac{\Lambda}{x}\,\mathcal{I}_{(\Lambda/\ln(2)x+1)}\right)\right]^{(n)},$$

or (after setting $\Lambda/x = z$)

$$\mathcal{F}_{\mathcal{P}}(x) = \sum_{n=0}^{\Lambda} \frac{(-z)^n}{n!}\left[\exp\left(-z\,\mathcal{I}_{(z/\ln(2)+1)}\right)\right]^{(n)}, \qquad (4.21)$$

here $\Lambda$ is a regulator which has to be set to $+\infty$ at the
end of calculations. It is important to recognize that the
RHS of (4.21) depends on all $\alpha \in [1,\infty)$. Another, more
intuitive, proof of the same fact is provided in Appendix
G. In addition, in Appendix H we show that a similar
"reconstruction" theorem holds also for THC entropy $\mathcal{S}_\alpha$.

As a result, when working with $\mathcal{I}_\alpha$ of different orders
we receive more information than when restricting our
consideration to only one $\alpha$. In this connection it is illuminating
to rewrite the complex integral in (4.20) as

$$\int_{-i\infty+0^+}^{i\infty+0^+} dp\, \frac{e^{p(x+\log_2 p_k)}}{p} = PP\int_{-\infty}^{\infty} dp\, \frac{e^{ip(x+\log_2 p_k)}}{p} + i\pi. \qquad (4.22)$$

Here $PP$ stands for the principal part (associated with the
pole at $p = 0$). The term $i\pi$ is the sole contribution from
$p = 0$ (i.e., $\alpha = 1$), while the $PP(\ldots)$ part corresponds to the
contribution from the (imaginary axis) neighborhood of
$p = 0$. In the case when $(x + \log_2 p_k) > 0$ then $PP(\ldots) =
i\pi$ and when $(x + \log_2 p_k) < 0$ then $PP(\ldots) = -i\pi$, so
the $\alpha = 1$ contribution has precisely 50% dominance. It
should be also realized that $PP(\ldots)$ is ruled for most $p_k$'s
by $p$'s from the close proximity of $p = 0$. In fact,

$$PP\int_{-\infty}^{\infty} dp\, \frac{e^{ip(x+\log_2 p_k)}}{p} = PP\int_{-\delta}^{\delta} dp\, \frac{e^{ip(x+\log_2 p_k)}}{p} - 2i\,\mathrm{si}(\delta y)$$

$$\approx PP\int_{-\delta}^{\delta} dp\, \frac{e^{ip(x+\log_2 p_k)}}{p} + 2i\,\varepsilon(y)\left(\pi/2 - \delta|y| + O((\delta|y|)^3)\right), \qquad (4.23)$$

with $\delta$ being the $\delta$ neighborhood of $p = 0$, $\mathrm{si}(x)$ being the
sine integral and $y = (x + \log_2 p_k)$. Hence we see that
when the outcome space is a discrete set we need generally
all $\mathcal{I}_\alpha$'s with $\alpha \in [1,\infty)$ to determine $\mathcal{P}$, albeit the
most dominant contribution comes from the relatively
small neighborhood of $\mathcal{I}_1 = \mathcal{H}$. The latter statement is
the discrete-space variant of the conclusion in [39].

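Let us now briefly comment on the reconstruction
theorem for the cases when the outcome space is a
$d$-dimensional subset of $\mathbb{R}^d$. By covering the subset with
the mesh of $d$-dimensional cubes of size $l^d = (1/n)^d$ we
obtain, similarly as in Section III, the integrated distributions
$\mathcal{P}_n = \{p_{nk}\}$ and $\mathcal{E}_n = \{E_{nk}\}$. The corresponding
information-distribution function now reads

$$\mathcal{F}_{\mathcal{P}_n/\mathcal{E}_n}(x) = \frac{\sum_{-\log_2(p_{nk}/E_{nk}) < x}\, (p_{nk}/E_{nk})}{\sum_k (p_{nk}/E_{nk})}. \qquad (4.24)$$

This implies (for $V = 1$) that

$$\int_{x=-d\log_2 n}^{\infty} 2^{(1-\alpha)x}\, d\mathcal{F}_{\mathcal{P}_n/\mathcal{E}_n}(x) = \frac{\sum_k p_{nk}^\alpha}{\sum_k E_{nk}^\alpha},$$

and so in accord with (3.3)

$$\mathcal{I}_\alpha(\mathcal{F}) = \lim_{n\to\infty}\frac{1}{(1-\alpha)}\log_2\left(\int_{x=-d\log_2 n}^{\infty} 2^{(1-\alpha)x}\, d\mathcal{F}_{\mathcal{P}_n/\mathcal{E}_n}(x)\right) \equiv \lim_{n\to\infty}\mathcal{I}^{(n)}_\alpha(\mathcal{F}). \qquad (4.25)$$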
Using the Widder-Stieltjes inverse formula we may re-create
$\mathcal{F}_{\mathcal{P}_n/\mathcal{E}_n}(x)$ (and hence $\mathcal{F}$) in terms of the $\mathcal{I}^{(n)}_\alpha(\mathcal{F})$'s.
But the important moral here is that in the continuous
limit (large $n$) $x \in (-\infty, \infty)$ and so $\alpha \in (-\infty, \infty)$. Unlike
in discrete sample spaces, all $\mathcal{I}_\alpha$, including those with
$\alpha < 1$, are now needed to pinpoint the underlying PDF.

It should be borne in mind that from a purely mathematical
point of view the reconstruction procedure presented
here is by no means a proof which extends easily
to (multi)fractal systems - there is no obvious analogue
of the Widder-Stieltjes inverse formula there. It should
rather be taken as an indication that in general systems all
$\mathcal{I}_\alpha$ with $\alpha \in (-\infty,\infty)$ are needed to determine uniquely
the probability distribution. This is basically a weak version
of the celebrated moment problem of Hausdorff [41].
The latter resonates with the finding that for deterministic
chaotic systems the multifractal scaling function $\tau(q)$
often exists even for negative values of $q$. In those cases
the partition function (4.5) is dominated by very small
values of $p_i$. Hence one may be skeptical about the real
existence of such a negative-$q$ scaling behavior since the
latter can be easily disrupted by fluctuations. In fact, if
we explore the stability of Renyi's entropy for negative $\alpha$
by adding a small imaginary part to $\alpha$ we obtain Fig.1.




FIG. 1. A plot of Renyi's entropy $\mathcal{I}_\alpha(\mathcal{P})$ for 2 dimensional
$\mathcal{P} = (p_1, p_2) = (p, 1 - p)$. We choose $p = 0.01$.

As $p$ goes closer to zero there is a violent proliferation of
branch cuts in the left half of the complex $\alpha$-plane. So
information conveyed by Renyi's entropy with negative
$\alpha$ starts to be highly unreliable. Because Renyi's entropy
is connected with the generalized dimensions via relation
(4.17) such a breakdown of scaling for negative $q$'s (and
hence $\alpha$'s) should be inevitable in various deterministic
chaotic systems. This is indeed the case, see e.g., [46].

The former reasonings may, to a certain extent, vindicate
the use of $\alpha > 0$ in usual information theory. The
bound $\alpha > 0$ can hence be understood merely as a reliability
bound imposed on the conveyed information.
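
The sensitivity of negative-order Renyi entropies to small probabilities can also be seen with purely real $\alpha$. The sketch below is an illustration under assumed values ($p = 10^{-4}$ and a fluctuation $\delta = 10^{-4}$ in the rare outcome); it does not reproduce the complex-$\alpha$ branch-cut structure of Fig. 1, only the fact that a tiny fluctuation in a rare event shifts $\mathcal{I}_{-2}$ far more than $\mathcal{I}_{2}$.

```python
import numpy as np


def renyi_entropy(p, alpha):
    """Renyi entropy in bits of a discrete distribution p, for real alpha != 1."""
    p = np.asarray(p, dtype=float)
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)


def sensitivity(p, delta, alpha):
    """|I_alpha(p + delta, 1 - p - delta) - I_alpha(p, 1 - p)| for a two-outcome distribution."""
    base = renyi_entropy([p, 1.0 - p], alpha)
    shifted = renyi_entropy([p + delta, 1.0 - p - delta], alpha)
    return abs(shifted - base)


if __name__ == "__main__":
    for alpha in (-2.0, 2.0):
        print(alpha, sensitivity(1e-4, 1e-4, alpha))
```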

C. Thermodynamic formalism and MaxEnt 

The second connection which we intend to advocate
and develop here is the connection with the maximal
entropy principle (MaxEnt). We will show that
from the MaxEnt point of view, extremizing Shannon's
entropy on (multi)fractals is equivalent to extremizing
Renyi's entropy directly, without invoking the
(multi)fractal structure explicitly. An explicit illustration
of this point on the network of cosmic strings will
be given elsewhere.

Consider a support paved with boxes of size $l$ and let
the integrated probability in the $k$th box be denoted as
$p_k$. Shannon's entropy of such a process is then

$$\mathcal{H}(\mathcal{P}_n) = -\sum_k p_k(l)\log_2 p_k(l).$$

The important observation of the multifractal theory is
that for $q = 1$

$$a(1) = \left.\frac{d\tau}{dq}\right|_{q=1} = \lim_{l\to 0}\frac{\sum_k p_k(l)\log_2 p_k(l)}{\log_2 l}. \qquad (4.26)$$

It can be shown that the number $a(1) = f(a(1))$ describes
the Hausdorff-Besicovich dimension of the set on
which the probability is concentrated (see e.g., [36]). This
means that the probability distribution $\mathcal{P}_n$ is concentrated
on the $l$-mesh cubes with $p_k(l) \sim l^{a(1)}$. In fact, the relative
probability of the complement set approaches zero in
the $l \to 0$ limit [36]. This statement goes also under the
name Billingsley theorem [42] or curdling [35]. The
corresponding subset $\mathcal{M}$ is known as the measure theoretic
support. Let us thus write

$$d_h(\mathcal{M}) \equiv f(a(1)) = \lim_{l\to 0}\frac{1}{\log_2 l}\sum_k p_k(l)\log_2 p_k(l) \approx \frac{1}{\log_2\varepsilon}\sum_k p_k(\varepsilon)\log_2 p_k(\varepsilon). \qquad (4.27)$$

Here $\varepsilon$ corresponds to a cutoff (or coarse graining) scale
of the grid. For further convenience we will keep
$\varepsilon = l_{cut}$ finite throughout all our calculations and set
$\varepsilon \to 0$ only at the end.

In the case of multifractal systems one is often interested
in the entropy of only certain (uni)fractal subsets. For
such a purpose it is useful to introduce a one-parametric
family of normalized distributions (zooming or escort
distributions) $\varrho(q)$ as

$$\varrho_k(q, l) = \frac{[p_k(l)]^q}{\sum_j [p_j(l)]^q} \sim l^{f(a_k)}.$$

Because the distribution $\varrho(q, l)$ alters the scaling of the
original distribution $\mathcal{P}_n$, the corresponding measure theoretic
support will change. As a matter of fact, the distribution
$\varrho(q, l)$ enables us to form an ensemble of measure theoretic
supports $\mathcal{M}^{(q)}$ parametrized by $q$. The parameter $q$ provides
a "zoom in" mechanism to probe various regions of
different singularity exponent. Indeed, from (4.7) we have

$$df(a) \;\begin{cases} < da & \text{if } q < 1 \\ > da & \text{if } q > 1. \end{cases} \qquad (4.28)$$



Integrating (4.28) from $a(q = 1)$ to $a$ we obtain

$$f(a) \le a \quad (\text{equality holds iff } q = 1), \qquad (4.29)$$

and so for $q > 1$ $\varrho(q)$ puts emphasis on the more singular
regions of $\mathcal{P}_n$ while for $q < 1$ the accentuation is on the
less singular regions (see also Fig. 2). The corresponding
fractal dimension of the measure theoretic support $\mathcal{M}^{(q)}$
of $\varrho(q)$ is

$$d_h(\mathcal{M}^{(q)}) = \lim_{l\to 0}\frac{1}{\log_2 l}\sum_k \varrho_k(q, l)\log_2 \varrho_k(q, l) \approx \frac{1}{\log_2\varepsilon}\sum_k \varrho_k(q,\varepsilon)\log_2 \varrho_k(q,\varepsilon). \qquad (4.30)$$

FIG. 2. A plot of the zooming distribution for 2 dimensional
$\mathcal{P}$: $\varrho(q) = p^q/(p^q + (1-p)^q)$.


We can now use (4.30) to find the promised connection
between multifractals and Renyi's entropy. To do this
let us observe that the curdling (4.30) mimics the situation
occurring in equilibrium statistical physics. There, in the
canonical formalism, one works with a (usually infinite)
ensemble of identical macroscopic systems with all possible
energy configurations. Notwithstanding, only the configurations
with $E_i \approx \langle E\rangle$ dominate in the thermodynamic limit.

In fact, defining the "microcanonical" partition function

$$Z_{mic} = \left(\sum_{a_k\in(a_i,\,a_i+da_i)} 1\right) = dN(a_i),$$

one gets for $a_i \approx \log_2(p_i)/\log_2\varepsilon$ (c.f., (4.2))

$$\langle a\rangle_{mic} = \sum_{a_k\in(a_i,\,a_i+da_i)} \frac{a_k}{Z_{mic}} \approx a_i, \qquad \langle f(a)\rangle_{mic} = \sum_{a_k\in(a_i,\,a_i+da_i)} \frac{f(a_k)}{Z_{mic}} \approx f(a_i). \qquad (4.31)$$



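Because in the micro-canonical approach the distribution
is uniform ($\mathcal{E}(a_i) = \{1/dN(a_i)\}$), the corresponding
Shannon-Gibbs entropy boils down to the micro-canonical
(or Boltzmann) entropy

$$\mathcal{H}(\mathcal{E}(a_i)) = \log_2 dN(a_i) = \log_2 Z_{mic}, \qquad (4.32)$$

and hence

$$\frac{\mathcal{H}(\mathcal{E}(a_i))}{\log_2\varepsilon} \approx -\langle f(a)\rangle_{mic}. \qquad (4.33)$$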



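Interpreting $E_i = -a_i\log_2\varepsilon$ as "energy" we may define
the "inverse temperature" $1/T = \beta/\ln 2$ (note that $k_B =
1/\ln 2$ here) as

$$\frac{1}{T} = \left.\frac{\partial\mathcal{H}}{\partial E}\right|_{E=E_i} = -\frac{1}{\ln\varepsilon\; Z_{mic}}\frac{\partial Z_{mic}}{\partial a_i} = f'(a_i) = q.$$

The Legendre transform then allows one to determine the
conjugate function $\tau(q)$ via

$$\langle f(a)\rangle_{mic} = q\langle a\rangle_{mic} - \tau(q). \qquad (4.34)$$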



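On the other hand, defining the "canonical" partition
function as

$$Z_{can} = \sum_i p_i(\varepsilon)^q = \sum_i e^{-\beta E_i},$$

(where the identifications $\beta = q\ln 2$ and $E_i =
-\log_2(p_i(\varepsilon))$ are made) the corresponding means are

$$a(q) \equiv \langle a\rangle_{can} = \sum_i \frac{a_i}{Z_{can}}\, e^{-\beta E_i} \approx \frac{\sum_i \varrho_i(q,\varepsilon)\log_2 p_i(\varepsilon)}{\log_2\varepsilon},$$

$$f(q) \equiv \langle f(a)\rangle_{can} = \sum_i \frac{f(a_i)}{Z_{can}}\, e^{-\beta E_i} \approx \frac{\sum_i \varrho_i(q,\varepsilon)\log_2 \varrho_i(q,\varepsilon)}{\log_2\varepsilon}. \qquad (4.35)$$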



Let us observe two things. Firstly, the fractal dimension
of the measure theoretic support $d_h(\mathcal{M}^{(q)})$ is simply
$f(q)$. If $q$ is a solution of the equation $a_i = \tau'(q)$ then in
the "thermodynamic" limit ($\varepsilon \to 0$) we can identify

$$a(q) \equiv \langle a\rangle_{can} \approx \langle a\rangle_{mic}, \qquad f(q) \equiv \langle f(a)\rangle_{can} \approx \langle f(a)\rangle_{mic}. \qquad (4.36)$$

Eqs.(4.35) then provide a parametric relationship between
$f(q)$ and the singularity exponent $a(q)$. When the
parameter $q$ is eliminated one recovers the usual singularity
spectrum $f(a)$. Eqs.(4.35) imply that $\langle f\rangle_{can} =
q\langle a\rangle_{can} - \tau$ and $\langle a\rangle_{can} = d\tau/dq$, and so again the
Legendre transform applies. Secondly, the micro-canonical
and canonical entropies coincide in the thermodynamic limit:

$$\mathcal{H}(\mathcal{E}(a)) \approx -\sum_k \varrho_k(q,\varepsilon)\log_2\varrho_k(q,\varepsilon) \equiv \mathcal{H}(\mathcal{P}_n)|_{f(q)}.$$

Here we have used the subscript $f(q)$ to emphasize that
the Shannon entropy $\mathcal{H}(\mathcal{P}_n)$ is basically the entropy of a
unifractal specified by the fractal dimension $f(q)$ defined
in (4.35). Because of relations (4.36) and the Legendre
transform (4.7) we obtain after a short algebra


+ / 



+ 



- q 



log2 s q-1 

(a - (a)can) 



(f- 



1 



(4.37) 



with q determined by the condition T'{q) = a and 



log2EiPi(£) 



log2 £ logs £ 

Applying l'Hospital's rule we find that

(f-r)- 



lim 

£^0 



(a - (a) can) + 



1 



log2 e = . 



(4.38) 



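Multiplying (4.37) by $\log_2\varepsilon$, taking the small $\varepsilon$ limit and
employing the renormalization prescriptions (4.10) and
(4.14) we finally obtain

$$\mathcal{I}^r_q = \mathcal{H}^r|_{f(q)}. \qquad (4.39)$$

The superscript $r$ indicates the renormalized quantities.
To understand (4.39) let us note that $\mathcal{H}(\mathcal{P}_n)|_{f(q)}$ can be
alternatively written as

$$\mathcal{H}(\mathcal{P}_n)|_{f(q)} = -\sum_{k=1}^{dN(a)} \frac{p_k(\varepsilon)}{\sum_{i=1}^{dN(a)} p_i(\varepsilon)}\,\log_2\left(\frac{p_k(\varepsilon)}{\sum_{i=1}^{dN(a)} p_i(\varepsilon)}\right) = \log_2 dN(a). \qquad (4.40)$$

Denoting the total probability of the incomplete distribution,
$\sum_{k=1}^{dN(a)} p_k(\varepsilon) \le 1$, as $S$ and the conditional distribution
$\{p_k(\varepsilon)/S;\ k = 1, \ldots, dN(a)\}$ as $\mathcal{P}'_n$ then

$$\mathcal{H}(\mathcal{P}_n)|_{f(q)} = -\frac{\sum_{k=1}^{dN(a)} p_k(\varepsilon)\log_2 p_k(\varepsilon)}{S} + \log_2 S. \qquad (4.41)$$

So the RHS of (4.39) equals Shannon's information of
an incomplete distribution [3,4] minus the information
corresponding to the total probability of the incomplete
system (i.e., unifractal).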
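
The decomposition just described, i.e., that the Shannon entropy of the conditional distribution $\{p_k/S\}$ equals $-\sum_k p_k\log_2 p_k/S + \log_2 S$ with $S = \sum_k p_k \le 1$, is an elementary identity and can be checked directly. The sketch below uses arbitrary example numbers and hypothetical function names; it verifies only this identity and the uniform case $\log_2 dN(a)$, not the renormalized relation (4.39).

```python
import numpy as np


def conditional_entropy(p_sub):
    """Shannon entropy (bits) of the conditional distribution p_k / S on a subset."""
    p_sub = np.asarray(p_sub, dtype=float)
    cond = p_sub / np.sum(p_sub)
    return -np.sum(cond * np.log2(cond))


def incomplete_form(p_sub):
    """-sum(p_k*log2(p_k))/S + log2(S), with S = sum(p_k) <= 1."""
    p_sub = np.asarray(p_sub, dtype=float)
    total = np.sum(p_sub)
    return -np.sum(p_sub * np.log2(p_sub)) / total + np.log2(total)


if __name__ == "__main__":
    subset = [0.1, 0.2, 0.05]  # an incomplete distribution: probabilities sum to 0.35
    print(conditional_entropy(subset), incomplete_form(subset))
    print(conditional_entropy([0.05] * 4))  # uniform subset of 4 boxes gives log2(4)
```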

In passing we can observe that for $q = 1$ the LHS
of (4.39) represents the Shannon entropy of the entire
multifractal system, while the RHS stands for the Shannon
entropy of the unifractal with the fractal dimension
$a(1) = f(a(1)) = D$. It is of course Billingsley's theorem
which makes sure that both sides match in the continuous
limit. Now, the passage from multifractals to
single-dimensional statistical systems is done by assuming that
the $a$-interval gets infinitesimally narrow and that the PDF
is smooth. In such a case both $a$ and $f(a)$ collapse to
$a = f(a) = D$ and $q = f'(a) = 1$. So, for instance,
for a statistical system with a smooth measure and the
support space $\mathbb{R}^d$ Eq.(4.39) constitutes a trivial identity.
We believe that this is the primary reason why Shannon's
entropy plays such a predominant role in the physics
of single-dimensional sets.

Let us finally make one more observation. If we apply
the MaxEnt approach to a single unifractal (say that with
the dimension $f(q)$) and try to infer the most probable
incomplete distribution which complies with whatever
macroscopic constraints we know about the unifractal
subsystem, we have to look for a conditional extremum
of Shannon's entropy $\mathcal{H}(\mathcal{P}_n)|_{f(q)}$. This can be done, at
least in principle, in two ways. We can either extremize
$\mathcal{H}(\mathcal{P}_n)|_{f(q)}$ with respect to the incomplete distribution keeping
$S$ fixed, or extremize $\mathcal{H}(\mathcal{P}_n)|_{f(q)}$ directly with respect
to the zooming distribution $\varrho(q,\varepsilon)$. The second way
is often more manageable. As a result we obtain that
the least biased incomplete probability distribution on
the unifractal characterized by the dimension $f(q)$ is obtained
via extremizing Renyi's entropy $\mathcal{I}_q(\mathcal{P}_n)$ with respect
to the zooming distribution $\varrho(q,\varepsilon)$. So by changing
the $q$ parameter of Renyi's entropy one can "skim
over" all unifractal Shannon's entropies. If, additionally,
the macroscopic constraints correspond to state variables
then the MaxEnt approach naturally allows for a thermodynamic
description of multifractals.



V. FINAL REMARKS 

It was the aim of this paper to present a self-contained
discussion of Renyi's entropy. Apart from formal
information theory aspects of Renyi's entropy we have studied
its bearing on various topics of current interest in
physics. These include the THC non-extensive entropy,
fractal and multifractal systems, the PDF reconstruction
theorem, chaotic dynamical systems and the MaxEnt approach
to thermodynamics.

It should be noted that the thermodynamical or statistical
concept of entropy, though deeply rooted in physics,
is rigorously defined only for equilibrium systems or, at
best, for adiabatically evolving systems. In fact, the
very existence of the entropy in thermodynamics is attributed
to Caratheodory's inaccessibility theorem [43]
and the statistical interpretation behind the thermodynamical
entropy is then usually provided via the ergodic
hypothesis [14,44]. When one moves away from equilibrium
there are very few clues left as to how one should proceed
to define entropy. In particular, there is no general
concept of ergodicity which could come to our rescue.
But just what is entropy then? It is frequently said that
entropy is a measure of disorder, and while this needs
many qualifications and clarifications it is generally believed
that this does represent something essential about
it. Insistence on the former interpretation, however, naturally
begs for an operational prescription. To tackle this
issue we have resorted to information theory. Here disorder
is quantified in terms of missing information and
the corresponding information entropy is a measure of
our ignorance about the system in question. We feel that
the latter is a natural and conceptually very clean extension
of the equilibrium concept of entropy. This might
be further reinforced by the fact that the information
entropy stands on a fully rigorous mathematical footing.
Actually, information theory provides a whole hierarchy of
information entropies, each of which is compatible with the basic
axioms of information theory and the theory of probability.
Such information entropies are mutually distinguished by
their order (Renyi's parameter). It is well known [32]
that the information entropy of order 1 (Shannon's entropy)
can successfully reproduce the usual equilibrium
statistical physics and hence thermodynamics on simple
metric spaces. It was one of the aims of this paper to
show that when dealing with (multi)fractal systems one
needs to use also information entropies of orders $\alpha \neq 1$
- Renyi entropies. In fact, because the concept of information
does not hinge on the notion of equilibrium
or non-equilibrium, one may go even further and apply
information entropies to various non-equilibrium situations
(for the $\alpha = 1$ case see e.g., [45] and citations therein).

Because of this versatile nature of Renyi's entropy we
are rather tempted to believe that THC entropy is only a
derived (i.e., not fundamental) concept in physics. We
substantiate the latter by arguing that in certain instances
- e.g., rare events systems - THC entropy is the
leading order approximation to Renyi's entropy. In addition,
because Renyi's entropy is a monotonic function of
THC entropy all stability conditions in thermodynamics
are identical in both cases and so from the thermodynamical
point of view both entropies are indistinguishable.
In those cases it is a matter of taste and/or technical
convenience which one will be applied [6]. It should be
also noted that in this light an apparent non-extensivity
of THC entropy could possibly be viewed as an artificial
(local) feature of much the same origin as is the
non-periodicity of leading (i.e., local) contributions to
(globally) periodic functions.

It should be admitted, however, that the authors see a
possible loophole for THC entropy to play a more pivotal
role - i.e., to be an autonomous (not derived) and conceptually
clean construct, similarly as, for example, Fisher's
entropy^ is. The loophole seems to be provided by
quantum non-locality. The point is that in order to obtain
some breathing space for THC entropy some of the
axioms of Renyi's entropy must be bypassed or at least
softened. The authors feel that the only plausible possibility is
to violate axiom 3 of Section II A with its additivity
of independent information. In fact, we have derived the
additivity of entropies for independent experiments with
the hidden assumption that experiments are independent
if (and only if) they are uncorrelated. In quantum mechanics,
however, the relationship between independent
and uncorrelated is more delicate. At present it seems
that the feasible mechanism which questions, although
in a very subtle way, the equivalence between being independent
and being uncorrelated is attributed to
quantum non-locality and, in particular, quantum entanglement.
The Bohm-Aharonov effect, Berry phase, EPR
paradox, Wheeler's delayed choice experiment and quantum
teleportation are the most prominent examples
of the aforementioned. Indeed, one can go even so far
as to claim that because the whole Universe is inherently
quantum correlated one should refrain from using
Renyi's entropy altogether. Whether or not these ideas
are viable and whether or not the affiliated entropy is
connected with THC entropy remains yet to be seen.

^Fisher's entropy (or information) is an important concept in
parametric statistics as it represents a measure of the amount
of information a given statistical sample contains about the
parameter which parametrizes the PDF. It is well known that
there is an intimate connection between Fisher's and Shannon's
[49] (and Renyi's [4]) entropy, yet both concepts are
completely autonomous.

As we have shown, Renyi's entropy has a built-in predisposition
to account for self-similar systems and so it
naturally aspires to be an effective tool to describe phase
transitions (both in equilibrium and non-equilibrium). It
is thus a challenging task to find some connection with
such typical tools of critical phenomena physics as the
conformal and renormalization groups. The latter could
in turn bring about a better understanding of the role
of the $\alpha$ parameter for systems away from equilibrium. An
interesting application of the former observation is in
cosmic string physics. In cosmology, unified gauge theories
of particle interactions allow for a sequence of phase
transitions in the very early universe, some of which may
lead to defect formation via the so called Kibble-Zurek
mechanism [50]. Cosmic strings, as the most pronounced
example of such defects, could have important relevance
for the large scale structure formation of the universe or
for cosmic microwave background radiation anisotropies.
In astrophysics, for instance, cosmic strings could play
an important role in the dynamics of neutron stars and in
galaxy astrophysics. In the usual cases when the
grand-canonical approach is applied it is argued that at the
critical (phase transition) temperature, i.e., at the Hagedorn
temperature [51], at which strings tend to fragment into the
smallest allowed loops while large loops become exponentially
suppressed, the correspondence between the
canonical and micro-canonical ensembles breaks down
as the grand-canonical partition function diverges [52].
Various viewpoints with different remedies were lately 
proposed in the literature. It seems, however, that non 
of the treatments has accommodated the well known fact 
that the string state-space acquires approximately self- 
similar structure which is exact at critical temperature 
[51,52]. From this standpoint Rcny's statistics appears 
to be particularly suitable for generalization of the Hage- 
dorn theory as it could better grasp the vital features 
near the critical point. In addition, Renyi's theory can 
be applied to construct the generalized grand-canonical 
partition function for the string network. Our current re- 
sults suggest that the new phase transition temperature 
should be lower than the one predicted by Hagedorn's 
theory. It would be definitely interesting to exploit this 
further and contrast our way with the more customary 
conformal theory approach. Work along those lines is 
presently in progress [53]. 

Let us finally mention that because symmetry-breaking
phase transitions with string-like defects occur in a
variety of physical systems, ranging from $^3$He and
$^4$He superfluids to the early Universe, with
superconductors and liquid crystals in between, one can
hope that predictions based on Renyi's entropy could be
directly tested in the laboratory. In this connection, the
analysis of the vortex tangle [54] (turbulence of vortex
loops in the superfluid phase of $^4$He) is one such
particularly promising system with a room-size
experimental setting (see e.g., [55]).



APPENDIX A

In this appendix we present an alternative way of finding
the unique class of the Kolmogorov-Nagumo functions. Let
us start with Eq.(2.10), which we rewrite in the form

$$ f(\zeta x) = a(x) f((\zeta - 1)x) + f(x) , \tag{5.1} $$

with $\zeta$ being an arbitrary real constant ($\zeta > 0$).
The latter is equivalent to the equation



(5.2) 



Note that when $\zeta \to 0$ then $f(0) = 0$. The latter
should be imposed as a boundary condition on prospective
solutions. The solution of the functional equation (5.2)
can be easily found; indeed, realizing that functions
fulfilling the scaling condition (5.2) obey the Euler-type
equation

$$ x \frac{\partial f(x)}{\partial x} = - \frac{a(x) \ln a(x)}{1 - a(x)} f(x) , \tag{5.3} $$

we may directly write that

$$ f(x) = \gamma \exp\left[ - \int dx \, \frac{a(x) \ln a(x)}{x (1 - a(x))} \right] . \tag{5.4} $$



Shortly we will see that function (5.4) is the only one
fulfilling the functional equation (5.1). Let us, however,
first determine the function $a(x)$. From (5.1) it follows
that

$$ a(x) = \frac{f(\zeta x) - f(x)}{f((\zeta - 1)x)} . \tag{5.5} $$



Because the latter should be true for any $\zeta > 0$ we
may safely assume that $\zeta = 1 + \varepsilon/x$ with
$\varepsilon$ being an infinitesimal. Then with the help
of l'Hospital's rule we obtain

$$ a(x) = \frac{f'(x)}{f'(0)} \quad \Rightarrow \quad f(x) = f'(0) \int_0^x dy \, a(y) . \tag{5.6} $$



Note that $a(0) = 1$. On the other hand, (5.5) may be
equivalently written as

$$ a((\zeta - 1)x) = \frac{f(\zeta x) - f((\zeta - 1)x)}{f(x)} . \tag{5.7} $$



Taking now the derivative $\partial/\partial\zeta$, using
(5.1) and subsequently setting $\zeta = 2$ we get

$$ a'(x) = (a(x) - 1)(\ln f(x))' = \frac{a(x) \ln a(x)}{x} \quad \Rightarrow \quad \ln a(x) = c x . \tag{5.8} $$

If the integration constant $c \neq 0$ then
$a(x) = \exp(cx)$ and hence (see (5.4) and (5.6))

$$ f(x) = \gamma (\exp(cx) - 1) . \tag{5.9} $$

In the latter the condition $f(0) = 0$ was used. We have
defined $\gamma = f'(0)/c$. In the case that $c = 0$, we
have from (5.8) that $a(x) = \mbox{const.} = 1$ and so

$$ f(x) = f'(0) x . \tag{5.10} $$



So we see that the compatible Kolmogorov-Nagumo functions
are only the linear and exponential ones. We should also
note that the linear $f(x)$ is retrieved from the
exponential $f(x)$ in the limit $c \to 0$.

Let us now turn to the point of uniqueness of $f(x)$. For
that purpose let us assume that there are two different
functions $f_1(x)$ and $f_2(x)$, both fulfilling equation
(5.1) with an identical $a(x)$ and arbitrary $\zeta > 0$,
i.e.,

$$ f_1(\zeta x) = a(x) f_1((\zeta - 1)x) + f_1(x) , \qquad f_2(\zeta x) = a(x) f_2((\zeta - 1)x) + f_2(x) . \tag{5.11} $$

Because the latter should hold for any $\zeta > 0$ the
following must be true

$$ a'(x) = (a(x) - 1)(\ln f_1(x))' = (a(x) - 1)(\ln f_2(x))' . \tag{5.12} $$

As a result we have that $(\ln f_1(x))' = (\ln f_2(x))'$
and so $f_1(x) = \mbox{const.} \times f_2(x)$, which
confirms that only linear and exponential functions are
compatible with the additivity of information.
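
The statement above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the residual of Eq.(5.1) for the exponential pair $f(x) = \gamma(\exp(cx) - 1)$, $a(x) = \exp(cx)$ and for the linear pair $f(x) = f'(0)x$, $a(x) = 1$. The numerical values of `c`, `gamma` and the test grid are arbitrary illustrative choices, and the quadratic function is included only as a hypothetical counterexample.

```python
import math

def f_exp(x, c=0.7, gamma=1.3):
    """Exponential Kolmogorov-Nagumo candidate f(x) = gamma*(exp(c*x) - 1), Eq. (5.9)."""
    return gamma * (math.exp(c * x) - 1.0)

def a_exp(x, c=0.7):
    """Companion function a(x) = exp(c*x), Eq. (5.8)."""
    return math.exp(c * x)

def f_lin(x):
    """Linear candidate f(x) = f'(0)*x with the illustrative slope f'(0) = 2."""
    return 2.0 * x

def a_one(x):
    """Companion function of the linear case, a(x) = 1."""
    return 1.0

def residual(f, a, zeta, x):
    """Left-hand side minus right-hand side of Eq. (5.1)."""
    return f(zeta * x) - (a(x) * f((zeta - 1.0) * x) + f(x))

# Arbitrary test grid of (zeta, x) pairs with zeta > 0.
grid = [(zeta, x) for zeta in (0.5, 1.0, 2.0, 3.7) for x in (0.1, 1.0, 2.5)]
max_exp = max(abs(residual(f_exp, a_exp, z, x)) for z, x in grid)
max_lin = max(abs(residual(f_lin, a_one, z, x)) for z, x in grid)

# A quadratic function (hypothetical counterexample) does not satisfy (5.1) with a(x) = 1.
quad = residual(lambda x: x * x, a_one, 2.0, 1.0)
print(max_exp, max_lin, quad)
```

Both residuals vanish up to rounding, while the quadratic trial function leaves a finite remainder, in line with the uniqueness argument.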



APPENDIX B 

Here we present a proof that the five postulates of
Section IIA determine uniquely both Shannon's and Renyi's
entropies. Our proof consists of four steps:

a) Let us first denote
$\mathcal{I}(1/n, \ldots, 1/n) = \mathcal{L}(n)$. Then
from the second and fifth axioms it follows that

$$ \mathcal{L}(n) = \mathcal{I}(1/n, \ldots, 1/n, 0) \leq \mathcal{I}(1/(n+1), \ldots, 1/(n+1)) = \mathcal{L}(n+1) , \tag{5.13} $$

i.e., $\mathcal{L}$ is a non-decreasing function.

b) To find the explicit form of $\mathcal{L}$ we employ
the third postulate. For this purpose we will assume that
we have $m$ mutually independent experiments
$A^{(1)}, \ldots, A^{(m)}$, each with $r$ equally probable
outcomes, so

$$ \mathcal{I}(A^{(k)}) = \mathcal{I}(1/r, \ldots, 1/r) = \mathcal{L}(r) , \quad (1 \leq k \leq m) . \tag{5.14} $$

Because the experiments are independent,
$\mathcal{I}(A^{(k)}|A^{(l)} = A^{(l)}_i) = \mathcal{I}(A^{(k)})$
for $k \neq l$ and $\forall i$, and axiom 3 (generalized
to the case of $m$ experiments) implies that

$$ \mathcal{I}(A^{(1)} \cap A^{(2)} \cap \ldots \cap A^{(m)}) = \sum_{k=1}^{m} \mathcal{I}(A^{(k)}) = m \mathcal{L}(r) . \tag{5.15} $$

On the other hand, the experiment
$A^{(1)} \cap A^{(2)} \cap \ldots \cap A^{(m)}$ consists
of $r^m$ equally probable outcomes, and so

$$ \mathcal{L}(r^m) = m \mathcal{L}(r) . \tag{5.16} $$

This is nothing but Cauchy's functional equation [13]. It
might be shown [13,22] that for non-decreasing functions
(5.16) has a unique solution: $\mathcal{L}(r) = K \ln(r)$.
The constant $K$ can be determined from axiom 2, which
then directly implies that $\mathcal{L}(r) = \log_2(r)$.
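
As a small illustration of Eq.(5.16), the sketch below builds the joint distribution of $m$ independent experiments with $r$ equiprobable outcomes explicitly and compares $\mathcal{L}(r^m)$ with $m\mathcal{L}(r)$ for $\mathcal{L} = \log_2$. The helper names `hartley` and `joint_uniform` and the values `r = 3`, `m = 4` are our own illustrative choices, not notation from the text.

```python
import math
from itertools import product

def hartley(probabilities):
    """Information of an equiprobable experiment: log2 of the number of outcomes."""
    n = len(probabilities)
    # Guard: the formula L(n) = log2(n) is used here only for uniform distributions.
    assert all(abs(p - 1.0 / n) < 1e-12 for p in probabilities)
    return math.log2(n)

def joint_uniform(r, m):
    """Joint distribution of m independent experiments, each with r equiprobable outcomes."""
    return [(1.0 / r) ** m for _ in product(range(r), repeat=m)]

r, m = 3, 4
lhs = hartley(joint_uniform(r, m))   # L(r**m), evaluated on the joint experiment
rhs = m * hartley([1.0 / r] * r)     # m * L(r)
print(lhs, rhs)
```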

c) We now determine $\mathcal{I}(\mathcal{P})$ using
axiom 3. To this end we will assume that the experiment
$A = (A_1, A_2, \ldots, A_n)$ is described by the
distribution $\mathcal{P} = \{p_1, p_2, \ldots, p_n\}$
with $p_k$ ($1 \leq k \leq n$) being rational numbers, say

$$ p_k = \frac{g_k}{g} , \qquad \sum_{k=1}^{n} g_k = g . \tag{5.17} $$

Let us further have an experiment
$B = (B_1, B_2, \ldots, B_g)$ and let
$\mathcal{Q} = \{q_1, q_2, \ldots, q_g\}$ be the associated
distribution. We split $(B_1, B_2, \ldots, B_g)$ into $n$
groups containing $g_1, g_2, \ldots, g_n$ events
respectively. Consider now a particular situation in
which, whenever event $A_k$ in $A$ happens, then in $B$
all the $g_k$ events of the $k$-th group occur with the
equal probability $1/g_k$ and all the other events in $B$
have probability zero. Hence

$$ \mathcal{I}(B|A = A_k) = \mathcal{I}(1/g_k, \ldots, 1/g_k) = \log_2 g_k , \tag{5.18} $$

and so

$$ \mathcal{I}(B|A) = f^{-1}\left( \sum_k \varrho_k(\alpha) f(\log_2 g_k) \right) . \tag{5.19} $$

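On the other hand, $\mathcal{I}(A \cap B)$ can be directly
evaluated. Realizing that the joint probability
distribution corresponding to $A \cap B$ is

$$ \mathcal{R} = \{r_{kl} = p_k q_{l|k}\} = \Big\{ \underbrace{\tfrac{p_1}{g_1}, \ldots, \tfrac{p_1}{g_1}}_{g_1 \times}, \underbrace{\tfrac{p_2}{g_2}, \ldots, \tfrac{p_2}{g_2}}_{g_2 \times}, \ldots, \underbrace{\tfrac{p_n}{g_n}, \ldots, \tfrac{p_n}{g_n}}_{g_n \times} \Big\} = \{1/g, \ldots, 1/g\} , \tag{5.20} $$

we obtain that
$\mathcal{I}(A \cap B) = \mathcal{L}(g) = \log_2 g$.
Applying axiom 3 then

$$ \mathcal{I}(\mathcal{P}) = \log_2 g - f^{-1}\left( \sum_k \varrho_k(\alpha) f(\log_2 g_k) \right) $$

$$ = \log_2 g - f^{-1}\left( \sum_k \varrho_k(\alpha) f(\log_2 p_k + \log_2 g) \right) $$

$$ = \mathcal{L}(g) - f^{-1}\left( \sum_k \varrho_k(\alpha) f(\log_2 p_k + \mathcal{L}(g)) \right) . \tag{5.21} $$

Let us define $f_y(x) = f(-x - y)$
($\Rightarrow f_y^{-1}(x) + y = -f^{-1}(x)$). Then

$$ \mathcal{I}(\mathcal{P}) = f^{-1}_{\mathcal{L}(g)}\left( \sum_k \varrho_k(\alpha) f_{\mathcal{L}(g)}(\mathcal{I}_k) \right) . \tag{5.22} $$

By axiom 4, $f(x)$ is invertible in $[0, \infty)$ and so
both $f_{\mathcal{L}(g)}$ and $f^{-1}_{\mathcal{L}(g)}$
are continuous on $[0, \infty)$. Applying now postulate 1
(axiom of continuity) we may extend the result (5.22) from
rational $p_k$'s to any real valued $p_k$'s defined in
$[0,1]$.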

Let us now consider the case of independent events (i.e.,
$\mathcal{I}(B|A) = \mathcal{I}(B)$). From Section IIA
(and/or Appendix A) we already know that in this case the
only candidate for $f_{\mathcal{L}(g)}$ is a linear
function or a linear function of an exponential function.
Bearing in mind that two functions which are linear
functions of each other give the same mean (see Section
IIA), we may choose either $f_{\mathcal{L}(g)}(x) = x$ or
$f_{\mathcal{L}(g)}(x) = 2^{(\lambda - 1)x}$,
$\lambda \neq 1$. Consequently, from (5.22) we may write

$$ \mathcal{I}(\mathcal{P}) = \frac{1}{\lambda - 1} \log_2 \left( \frac{\sum_k p_k^{\alpha - \lambda + 1}}{\sum_k p_k^{\alpha}} \right) . \tag{5.23} $$



It should also be noticed that from axiom 5 it follows
that $(\alpha - \lambda + 1) > 0$ and $\alpha > 0$. Within
the scope of these inequalities Eq.(5.23) is valid for any
$\lambda$. It should be particularly noticed that
$\mathcal{I}(\mathcal{P})$ is continuous at $\lambda = 1$
as both the left and right limits coincide. It can be
easily checked that $\lambda = 1$ corresponds precisely to
the case of $f_{\mathcal{L}(g)}(x) = x$. Quantity (5.23)
was first proposed by Kapur [56] and named the entropy of
order $2 - \lambda$ and type $\alpha$.

Finally, it should be borne in mind that because the mean
(5.19) is unchanged under a linear transformation of the
function $f(x)$ we could, from the very beginning,
restrict ourselves to only positive invertible functions
on $[0, \infty)$.

d) In the last step we will specify the relationship
between $\alpha$ and $\lambda$. Using the fact that the
experiment $A \cap B$ has the (joint) probability
distribution $\mathcal{R} = \{r_{kl} = p_k q_{l|k}\}$ we
have



and 



+ Tj TT log2 y^ Pk ^F^T 



a-\+l 



. (5.25) 



Eq.(5.25) is a result of the fact that 



2( 



and that fc{g){x) = 2(^-i)^ f{x) = 2(1"^)^. Com- 
bining the axiom 3 and Eqs.(5.24)-(5.25) we obtain for 
A 7^ 1 the identity 



EkPk^' EM 
EkPkEiimkY 



a-A-l-l 



,a-X+l 



kPk ■ 



(5.26) 



Introducing the random variable 

; 

we may equivalently rewrite (5.26) as 



E 



A+l //-|("-^) 



k,l ' kl 



^ (1/Q(«^))„ = (l/Q(«^))„_;,+i . (5.27) 
Here {■ ■ is defined with respect to the distribution 

I k,l 

Because pkS are arbitrary, equality (5.27) happens if and 
only if Q^"-^) is a constant [19]. The latter implies that 

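$$ \sum_l (q_{l|k})^{\alpha - \lambda + 1} = \mbox{const.} , \quad \mbox{for } \forall k \mbox{ and } \forall q_{l|k} . \tag{5.28} $$

It is easy to see that Eq.(5.28) is satisfied only when
$\alpha = \lambda$. Substituting $\lambda = \alpha$ into
(5.23) we find

$$ \mathcal{I}(\mathcal{P}) = \mathcal{I}_\alpha(\mathcal{P}) = \frac{1}{1 - \alpha} \log_2 \sum_k p_k^{\alpha} . \tag{5.29} $$

The proof for $\lambda = 1$ follows an analogous route.
This proves our assertion.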
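
For illustration, the end result of this appendix, $\mathcal{I}_\alpha(\mathcal{P}) = \frac{1}{1-\alpha}\log_2\sum_k p_k^\alpha$ with Shannon's entropy as the $\alpha \to 1$ limit, can be evaluated directly. The sketch below is a minimal implementation under that formula; the function name `renyi_entropy` and the sample distributions are our own illustrative choices. It checks the additivity required by axiom 3 for two independent experiments and the continuity at $\alpha = 1$.

```python
import math

def renyi_entropy(p, alpha):
    """Renyi entropy (in bits) of order alpha; Shannon's entropy is used at alpha = 1."""
    p = [x for x in p if x > 0.0]            # zero-probability outcomes do not contribute
    if abs(alpha - 1.0) < 1e-12:
        return -sum(x * math.log2(x) for x in p)
    return math.log2(sum(x ** alpha for x in p)) / (1.0 - alpha)

# Two independent experiments (illustrative numbers) and their joint distribution.
p = [0.5, 0.25, 0.25]
q = [0.7, 0.3]
joint = [a * b for a in p for b in q]

for alpha in (0.5, 1.0, 2.0):
    total = renyi_entropy(joint, alpha)
    parts = renyi_entropy(p, alpha) + renyi_entropy(q, alpha)
    print(alpha, total, parts)               # the two columns agree (additivity)

# Continuity at alpha = 1: values just off alpha = 1 approach the Shannon value.
print(renyi_entropy(p, 1.0), renyi_entropy(p, 1.0 + 1e-6))
```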



APPENDIX C 

In this appendix we derive some basic properties of the
information measure $\mathcal{I}_\alpha(B|A)$.

From Appendix B we know that $f(x)$ compatible with axioms
1-5 is (up to a linear combination) either $x$ or
$2^{(1-\alpha)x}$. Then $\mathcal{I}(B|A)$ appearing in
axiom 3 turns out to have the form

$$ \mathcal{I}_\alpha(B|A) = \frac{1}{(1 - \alpha)} \log_2 \left( \frac{\sum_{k,l} (r_{kl})^{\alpha}}{\sum_k (p_k)^{\alpha}} \right) , \tag{5.30} $$

with
$\mathcal{P}(A \cap B) = \{r_{kl} = p_k q_{l|k} = q_l p_{k|l}\}$.
We have reintroduced the sub-index $\alpha$ to emphasize
the parametric dependence of $\mathcal{I}$. It results
from (5.30) that for every $\alpha$

$$ 0 \leq \mathcal{I}_\alpha(B|A) \leq \log_2 n , \tag{5.31} $$

where $n$ is the number of outcomes in the experiment $B$.
Indeed, $0 \leq \mathcal{I}_\alpha(B|A)$ holds due to the
simple fact that for a fixed $k$ and $\alpha > 1$



(realize that $\sum_l q_{l|k} = 1$). Equality in (5.32) is
clearly valid if and only if for any $k$ there exists just
one $l = l(k)$ such that $q_{l(k)|k} = 1$ and $0$
otherwise. The latter means that outcomes of $A$ uniquely
determine outcomes of $B$ and hence we do not learn any
new information about $B$ by knowing $A$. In such a case
(5.30) gives $\mathcal{I}_\alpha(B|A) = 0$. This is what
one would naturally expect from a conditional information.

Similarly, for $0 < \alpha < 1$ the reverse inequality in
(5.32) holds and hence
$\sum_{k,l} (r_{kl})^{\alpha} \geq \sum_k (p_k)^{\alpha}$
(the former comments about the equality apply here as
well). This proves our assertion about the LHS inequality
in (5.31).

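On the other hand, the RHS inequality in Eq.(5.31) holds
because for $\alpha > 1$, $\sum_l (q_{l|k})^{\alpha}$ is a
convex function which has its minimum at $q_{l|k} = 1/n$
(for $\forall k$). So

$$ \sum_l (q_{l|k})^{\alpha} \geq n^{1 - \alpha} , $$

while for $0 < \alpha < 1$ the opposite inequality holds.
Thus

$$ \mathcal{I}_\alpha(B|A) = \frac{1}{(1 - \alpha)} \log_2 \left( \frac{\sum_{k,l} (r_{kl})^{\alpha}}{\sum_k (p_k)^{\alpha}} \right) \leq \log_2 n . \tag{5.33} $$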
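
The bounds (5.31) can be probed numerically. The sketch below is only a spot check under the assumption that $\mathcal{I}_\alpha(B|A)$ is evaluated as $\mathcal{I}_\alpha(A \cap B) - \mathcal{I}_\alpha(A)$, i.e., as $\frac{1}{1-\alpha}\log_2\big(\sum_{k,l} r_{kl}^\alpha / \sum_k p_k^\alpha\big)$; the helper names, the random seed and the matrix sizes are hypothetical choices of ours. It also evaluates the two limiting cases discussed above: outcomes of $B$ fully determined by $A$ (lower bound) and equiprobable conditional distributions (upper bound).

```python
import math
import random

def renyi_entropy(p, alpha):
    """Renyi information (bits) of order alpha != 1 for a discrete distribution."""
    return math.log2(sum(x ** alpha for x in p if x > 0.0)) / (1.0 - alpha)

def conditional_renyi(joint, alpha):
    """I_alpha(B|A) = I_alpha(A and B) - I_alpha(A); joint[k][l] holds r_kl."""
    p_a = [sum(row) for row in joint]          # marginal distribution of A
    flat = [r for row in joint for r in row]   # joint distribution as a flat list
    return renyi_entropy(flat, alpha) - renyi_entropy(p_a, alpha)

def random_joint(n_a, n_b, rng):
    """Strictly positive random joint distribution (illustrative helper)."""
    raw = [[rng.random() + 0.01 for _ in range(n_b)] for _ in range(n_a)]
    total = sum(sum(row) for row in raw)
    return [[r / total for r in row] for row in raw]

rng = random.Random(0)
n_b = 5
values = [conditional_renyi(random_joint(4, n_b, rng), alpha)
          for alpha in (0.3, 0.7, 2.0, 5.0) for _ in range(50)]
print(min(values), max(values), math.log2(n_b))

# Limiting cases: B determined by A (lower bound) and uniform q_{l|k} (upper bound).
deterministic = [[0.2, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.3]]
uniform_cond = [[p / 3.0] * 3 for p in (0.2, 0.5, 0.3)]
print(conditional_renyi(deterministic, 2.0), conditional_renyi(uniform_cond, 2.0))
```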



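Inequality (5.33) may be viewed as a weak version of the
well known $\alpha = 1$ case where $H(B|A) \leq H(B)$ with
equality if and only if $B$ and $A$ are independent
experiments [22] (i.e., knowing outcomes of $A$ does not
have any effect on the distribution of outcomes of $B$).
However, the aforesaid does not generally hold for
$\alpha \neq 1$. This is because

$$ \mathcal{I}_\alpha(B) - \mathcal{I}_\alpha(B|A) = \frac{1}{(1 - \alpha)} \log_2 \left( \frac{\sum_l (q_l)^{\alpha} \sum_k (p_k)^{\alpha}}{\sum_{l,k} (p_k q_{l|k})^{\alpha}} \right) , \tag{5.34} $$

and the identity

$$ \sum_{k,l} (p_k q_l)^{\alpha} = \sum_{k,l} (p_k q_{l|k})^{\alpha} \tag{5.35} $$

can be fulfilled for $\alpha \neq 1$ in numerous ways [26]
without assuming that $q_{l|k} = q_l$ (for example, in the
$\alpha = 2$ case we may choose $\mathcal{P} = \{1/n\}$,
$\mathcal{Q} = \{1/n\}$ and
$\mathcal{P}(B|A) = \{1, 0, 0, \ldots, 0\}$). However, in
the limiting case $\alpha \to 1$ Eq.(5.35) turns out to be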



(5.36) 



which has a solution if and only if $q_{l|k} = q_l$, i.e.,
in the case of independent events [22]. Yet still,
$\mathcal{I}_\alpha(B|A)$, $\alpha \neq 1$, can be, in a
sense, viewed as conditional information. This is so
because when $B$ and $A$ are independent then from (5.34)
it follows that
$\mathcal{I}_\alpha(B) = \mathcal{I}_\alpha(B|A)$. The
opposite implication, as we have seen, is not valid in
general. It is, however, valid when $B$ has an
equiprobable distribution. The latter is a simple
consequence of Jensen's inequality because for
$\alpha > 1$



and so for V{B) = Q = {qi = l/n} 

T,i.k€pt ^ Hi'ifE^.k^Ap 



T^kA^kiY 



< 



-'k\j I 



Y.i,k^i{Pk\ir 



J„(B)-J„(B|^)>0, 



(5.37) 



with equality if and only if the equality in Jensen's
inequality holds. This happens only when $p_{k|l}$ is a
constant for $\forall l$, i.e., when $A$ and $B$ are
independent. The counterpart with $0 < \alpha < 1$ can be
proved in exactly the same way.



APPENDIX D 



In this appendix we derive relations (3.4) and (3.5). We
begin with the notion of the integration of continuous
functions defined on fractal sets [47,48]. Consider a
fractal set $M$ embedded in a $d$-dimensional space. Let
us cover the set with a mesh $M^{(l)}$ of $d$-dimensional
(disjoint) cubes $M_i^{(l)}$ of size $l^d$ and let
$N_l(M)$ be the minimal number of cubes needed for the
covering. Functions with support in the mesh are called
simple if they can be decomposed in the following way:



a(0(x)=^afxr'(x). 



Here xf^ are characteristic functions, i.e. 

f lifxeMf 
\0ifx^Mf. 

Then the integral Jj^ dfj, Q^^^ is defined as 



yZ^PkqiT) =y£^kqi\kT) ,(5.35) / d^(x)e«(x) = x;ei%«(Mf), 

k,l k,l Jm ■ 



(5.38) 



(5.39) 



(5.40) 
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where the measure /x*^'^ is the measure on the covering 
mesh. The precise form of the measure wiU be speci- 
fied shortly. On the covering mesh M^'' wc can build a 
cr-structure in a usual way. As a result, if 5 is a nonnega- 
tive /x^') measurable function then ^(x) = lim;^o ^/'•'H^) 
for all X e M^^\ for some sequence {Sf^} of monotonic 
increasing nonnegative simple functions. Owing to this 
fact we may define 

/ dM(x) a(x) = limToP M^'^mW) . (5.41) 

Jm i=i 

In this connection it is important to notice that due to
the scaling prescription (4.1)

$$ \log l^D \approx -\log N_l + o(l^0) \quad \Rightarrow \quad l^D N_l = V_l \rightarrow V . \tag{5.42} $$

Here $V_l$ is the pre-fractal volume which in the small
$l$ limit converges to the true fractal volume $V$. A
natural candidate for $\mu^{(l)}(M_i^{(l)})$ is the
fraction $V(M^{(l)})/N_l$, which in the small $l$ limit
behaves as $l^D = V/N_l$. So, particularly when
$\mathcal{F}$ is a continuous PDF, we have

/ dMx)^(x) = limV.F«Z^. (5.43) 



1=1 



The integrated probability of the fc-th cube is thus Pnk = 
J^'lpl^ . A simple consistency check can be demonstrated 
on Pnk = £nk- Indeed, from Section IV A we know that 

^nk = /Vi and so may write 

l = limg£:„. = limg^=y^rfM^ = l. (5.44) 

We thus see that the integral prescription (5.43) applies
correctly in the case of uniform distributions.
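
As a concrete instance of the scaling $l^D N_l = V_l \rightarrow V$, one may take the middle-third Cantor set, a standard textbook fractal that is not discussed in the text and is used here purely for illustration. The sketch below covers it with $N_l = 2^k$ intervals of size $l = 3^{-k}$, uses $D = \ln 2/\ln 3$, and assigns the equiprobable weight $1/N_l$ to each covering box; the function name `cantor_boxes` is hypothetical.

```python
import math

D = math.log(2.0) / math.log(3.0)   # fractal dimension of the middle-third Cantor set

def cantor_boxes(level):
    """Left endpoints of the 2**level intervals of size 3**-level covering the set."""
    points = [0.0]
    for j in range(1, level + 1):
        step = 2.0 * 3.0 ** (-j)
        points = points + [x + step for x in points]
    return sorted(points)

for level in range(1, 9):
    size = 3.0 ** (-level)            # box size l
    n_boxes = len(cantor_boxes(level))  # covering number N_l
    prefractal_volume = size ** D * n_boxes   # V_l = l**D * N_l
    uniform_weights = [1.0 / n_boxes] * n_boxes   # equiprobable weight 1/N_l per box
    print(level, n_boxes, prefractal_volume, sum(uniform_weights))
```

In this example the pre-fractal volume equals 1 at every level (up to rounding) and the uniform weights are normalized, mirroring the consistency check of the uniform case.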

Using now the renormalization prescription (3.4)

J«(^) = lim {lcc{Vn) - TociSn)) 



1 — a 



log2 



(5.45) 



If we use the renormalization prescription (3.5) (or
equivalently when we set $V = 1$ for
$\mathcal{I}_\alpha(\mathcal{E}_n)$ in (5.45)) we easily
see that



Footnote: It should be noted that the measure just defined
basically coincides with the $D$-dimensional Hausdorff
measure.



To.iT) = lim (laiVn) - TaiSn)\v=l) 

= lim(J„(P„) + £>log2 



1-a 



l0g2 



M 



(5.46) 



Our renormalization prescription is obviously consistent
only when the integrals on the RHS of (5.45) and (5.46)
exist.



APPENDIX E 

We show here that Rcnyi's entropy X'^^\T) is not in- 
variant under a transformation of the continuous random 
variable A'-''^ while I^\T) is. Note first that in a discrete 
case, outcomes Ai,. . . , An have the same probability dis- 
tribution pi, . . . ,p„ as outcomes h{Ai), . . . , h{An), where 
h(. . .) is an arbitrary "well behaved" function. Hence 
Renyi's entropy for such a system is invariant under the 
/i-transformation. However, in the continuous case even 
the simplest linear transformation A'''^ cA'''^ does not 
leave invariant, indeed after rescalling A'-''^ to 

cA'-'^^ we obtain 



^ [inc)Ai] ^[{nc)A2] 



(nc) 



(nc) 



[{nc)Ad 
{nc) 



^(nc) ' 



and so 

J(<*)(c^c<*)) = lim (j„(i(,'")-dlog2n) 



lim 



^a(c-4(nc)) ~ C^l0g2("'C) + C;i0g2 C 



(5.47) 



So I'^'>{cA''''') ^ I '^'> {A' ''''). Situation becomes, however, 
different when we consider l^\cA'-''^). This is because 
we can rewrite J'f ' (c^***' ) as 

I^f{cA^-y)= hm (l^{A'n'>) - dlog^n) 

{M^IZ)) - dlog.n) . (5.48) 



— lim 



Here we have used £(^^0) instead of S^'' because the 
rescalling changes also the volume V of the outcome 
space into cV. A simple consequence of Eq. (5.48) is that 

i<f'(c^<") = i^^^iA^"^)- In fact, when h = (hi,..., ha) 
is an invertible and differentiable (vector) function it is 
simple to rewrite in a fully covariant manner. 

Indeed, realizing that scalar density transforms as 



^(x) 



dy 



9x 



(5.49) 
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(here y = /i(x)) we also know that 



APPENDIX F 



dy 



ax 



m(y) 



(5.50) 



(here m(y) denotes the /i-transformed uniform PDF). 
Then we see that 




a-l 



{I -a) 



I d^yr-^ )m(y) 

Jh(v) \TO(y) 




(5.51) 



If hi and /12 are any two invertible and differentiable 
vector functions so is their composition /12 o hi and then 

j(<i)(_4(<i)) = i^^->{hi{A^^^)) 

d^y(^Smi{y)] 
h,{v) \mi{y)J I 




(1 - a) \Jh20hl{V) 

i'fih2ohi{A^'^)) 



m2(z) 



m2(z) 



(5.52) 



with 

•miiy) 



dy 



dx 

dy 



dx 



= ^(X), ^2(Z) 

= 1/V, m2(z) 



dz 



dy 

dz 



dy 



= mi(y), (5.53) 



and $\mathbf{y} = h_1(\mathbf{x})$,
$\mathbf{z} = h_2(\mathbf{y}) = h_2 \circ h_1(\mathbf{x})$.
Thus $\mathcal{I}_\alpha(\mathcal{F})$ is invariant under
the outcome-space reparametrization. In addition, if we
restrict our consideration only to the class of
transformations which also have a differentiable inverse,
i.e., diffeomorphisms, we see from (5.52) and (5.53) that
the information measure $\mathcal{I}_\alpha(\mathcal{F})$
is invariant with respect to the group of diffeomorphisms.
This fact was first realized by E.T. Jaynes in the context
of Shannon's entropy [32]. As a matter of fact, when
setting $\alpha = 1$ we obtain from (5.52) that



H{r) = lim 



a^i (1 - a) 



loga 



[ d'y:F{y)logJ^] , 
Jh(v) \m{y) J 



(5.54) 



which precisely coincides with Jaynes's finding [31,32]. 
Entropy (5.54) is also known as the Kullback-Leibler rel- 
ative entropy. 
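
The invariance discussed in this appendix can be illustrated numerically. The sketch below is our own example, under the assumption that the reparametrization-invariant measure has the relative form $\frac{1}{1-\alpha}\log_2\int d\mathbf{y}\,(\mathcal{F}(\mathbf{y})/m(\mathbf{y}))^{\alpha} m(\mathbf{y})$, with $m$ the transformed uniform PDF. It uses the illustrative choices $\mathcal{F}(x) = 2x$ on $[0,1]$ (so $V = 1$), $y = h(x) = x + x^2$ and $\alpha = 2$, and compares the result with the naive expression $\frac{1}{1-\alpha}\log_2\int dy\,\mathcal{F}(y)^{\alpha}$, which does change under $h$.

```python
import math

ALPHA = 2.0

def midpoint(func, lo, hi, n=20000):
    """Composite midpoint rule for the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

def pdf_x(x):
    """Example PDF F(x) = 2x on [0, 1]; the reference uniform PDF is 1/V = 1."""
    return 2.0 * x

def x_of_y(y):
    """Inverse of the reparametrization y = h(x) = x + x**2 on [0, 1]."""
    return (-1.0 + math.sqrt(1.0 + 4.0 * y)) / 2.0

def jacobian(x):
    """dy/dx for h(x) = x + x**2."""
    return 1.0 + 2.0 * x

def pdf_y(y):
    """Transformed PDF, F(y) = F(x) / |dy/dx|."""
    x = x_of_y(y)
    return pdf_x(x) / jacobian(x)

def m_y(y):
    """h-transformed uniform PDF, m(y) = (1/V) / |dy/dx| with V = 1."""
    return 1.0 / jacobian(x_of_y(y))

def info(integral):
    """Map an integral to 1/(1 - alpha) * log2(integral)."""
    return math.log2(integral) / (1.0 - ALPHA)

rel_x = info(midpoint(lambda x: pdf_x(x) ** ALPHA, 0.0, 1.0))   # m(x) = 1 in x
rel_y = info(midpoint(lambda y: (pdf_y(y) / m_y(y)) ** ALPHA * m_y(y), 0.0, 2.0))
naive_y = info(midpoint(lambda y: pdf_y(y) ** ALPHA, 0.0, 2.0))
print(rel_x, rel_y, naive_y)
```

The first two numbers coincide, while the third differs, which is the behaviour the covariant form is designed to capture.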



In this appendix we derive relation (4.14). To start we
must first identify $\mathcal{E}_n$. If we denote by
$N_l(a_i)$ the number of boxes of size $l$ needed to cover
the unifractal with the singularity exponent $a_i$, then
$\mathcal{E}_n = \{\mathcal{E}_{nk}(a_i);\, k \in N_l(a_i),\, i \in N\}$.
Because of the scaling property we must set
$\mathcal{E}_{nk}(a_i) = c_k(a_i)\, l^{a_i}$ with
$c_k(a_i)$ weakly $l$ dependent. In order for
$\mathcal{I}_\alpha(\mathcal{E}_n)$ to represent the
"ground state" information we must require $c_k(a_i)$ to
be a constant (i.e., $c_k(a_i, l) = c(l)$). This is so
because in such a case our lack of information about the
multifractal system (provided we comply with the scaling
of probability) is clearly highest. This implies that
$c = 1/\sum_i N_l(a_i)\, l^{a_i}$, as indeed

$$ \sum_i \sum_{k=1}^{N_l(a_i)} \mathcal{E}_{nk}(a_i) = c \sum_i N_l(a_i)\, l^{a_i} = 1 . \tag{5.55} $$

Notice that c is weakly I dependent since Ni{ai)l"-' ^ 
= 1. To proceed further we employ the multifractal 
measure (4.13). There Vn = {Pnk} is the discrete (in- 
tegrated) probability distribution on the covering mesh. 
In case that the limit in (4.13) exists we may define the 
increment of /i^^ {d; I) between a and a + da'm. the small 
I limit as 



E 



Pnk 



(5.56) 



a+da < 



Eq.(5.56) then implies that 

/ N,(a.) 

lim log2 J2 E Pnkiai) 

~^ \ i k 



~log2 / dii^^\a) 

J a 

-h T(a) log2 1 , (5.57) 



and so especially 

iailJ'V) = lim {laiVn) - lai^n)) 

- -J— w (i^gW' 



(5.58) 



Under the condition that the integrals exist, relation
(5.58) represents a well defined (and finite) information
measure. For the same reasons as in Section III we may
conclude that $\mathcal{I}_\alpha(\mu_{\mathcal{P}})$
represents negentropy. Notice that, similarly as before,



/ 



dnP{a) 



1. 



(5.59) 



This results from the fact that ^((ai)^""*"'^^"^ is Oi 
independent in the small I limit. Actually, 
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da 



= ^ I da n(a)/--^(")+«'*-^(") 
da J 

= In/y da{a - ao) n(a)Z--^(«)+""-^(") 



O + O 



(inz)3/2 ; ' 



(5.60) 



On the last line of (5.60) we have applied Laplace's
formula of the asymptotic calculus [38]. Eq.(5.60)
confirms our previous assertion as it assures that the
vanishing of $dK(\alpha)/d\alpha$ at $l \to 0$ is at least
as fast as that of $1/(\ln l)^{3/2}$. The consequence of
this is that



K{0) V 1 



(5.61) 



(i^(l))" (K(l))" {K{l)Y 
The latter implies that K{1) = 1 and ergo (5.59) holds. 



APPENDIX G 

We show here an alternative way to obtain the real 
inverse formula for Eq.(4.19). Let us start with the fol- 
lowing observation: 

^vix) = ^ pk = ^pie{log2Pi + x) . (5.62) 

-log2Pk<X I 

Using the limit representation of the step function ^(a;); 
9{x) — Hm exp(— 2~t) , 

together with the functional relation 



eiiog2Pi + x) = e{x) 



log2 Pi 



+ e{-x)e' 



O{x)-eix)0< 



(5.63) 



we may rewrite (5.62) as 
T-p{x) = 9{x) — lim e{x) 
or equivalently 



(-1)"2. 



n=0 



(5.64) 



T^{x) « e{x) ^ ^ — '- 2-'^^(A-/-+i) . (5.65) 



Here the complementary information-distribution func- 
tion of V 

T^{x) = e{x) - Tv{x) = 

- log2 pfc>a;>0 

was defined. The regulator A 1/e. Note that because 
$x \in [0, +\infty)$ we have that
$\alpha \in [1, +\infty)$. This is in agreement with the
analysis based on the Widder-Stieltjes inverse formula.



APPENDIX H 

In this appendix we derive the reconstruction theorem 
for THC entropy. Starting with Eq.(4.20) we may write 



2ni 



dp 



ioo+a P 
■ioo+a 



= -WIw7/ dpeP^S^{V)+e{x) (5.66) 

ln(4)m J-ioo+a- 

where the step function 6{x) was added and subtracted 
and the Bromwich representation 

^^^^ 2m y_joo+CT '^^ P 
was used. As a result we obtain 

= Wir- / '^P e^"'5„(P) . (5.67) 
ln(4)7rz y_ioo+a 

The inverse Laplace-Stieltjes transformation then gives

So.{V) = / 2(i-«)- d^^(x) . (5.68) 

(a - 1) 



n=0 
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